Staff Backline Engineer - Data & AI

Databricks · Data & AI · United States · Support

Staff Backline Engineer role at Databricks focused on deep-dive troubleshooting, root cause analysis, and architectural optimization within the Databricks Data and AI ecosystem. The role involves developing automated workflows and AI-driven diagnostic tools to improve supportability and scale the organization. Requires deep expertise in one of three tracks: Data Engineering, Product Supportability, or AI (ML/GenAI systems, LLMs, agentic workflows).

What you'd actually do

  1. Conduct deep-dive forensics into Spark core internals and the broader Databricks Data and AI ecosystem to resolve high-priority architectural failures and complex system anomalies.
  2. Perform advanced code-level analysis and resource profiling to identify and mitigate systemic root causes, ensuring the stability and reliability of high-scale production workloads.
  3. Optimize architectural performance across the Data and AI stack by refining execution parameters and enforcing best-practice strategies to maximize resource efficiency and throughput.
  4. Analyze global issue trends and patterns to partner directly with Product Engineering, influencing the product roadmap and driving initiatives that enhance long-term supportability.
  5. Develop reproduction frameworks, automated workflows, and AI-driven diagnostic tools that translate complex backline findings into standardized resolution paths to empower and scale the broader organization.

Skills

Required

  • 10+ years of relevant experience
  • Deep expertise in Data Engineering (Spark, Delta Lake, Hive, Python, SQL, Scala) OR Product Supportability (distributed system internals, Java, Scala, Python) OR AI (large-scale ML, generative AI, LLMs, agentic workflows, ML lifecycle, distributed ML optimization)
  • Troubleshooting failures
  • Diagnosing performance issues
  • Identifying root causes
  • Code-level analysis
  • Resource profiling
  • Distributed systems

Nice to have

  • Customer success
  • Technical stakeholder management
  • Automation and tooling development
  • AI-driven diagnostic tools

What the JD emphasized

  • deep expertise in one of the following three specialized tracks
  • excellence in one area rather than proficiency in all

Other signals

  • Troubleshooting complex AI systems
  • Optimizing ML performance
  • Developing AI diagnostic tools