Sr. Staff Software Engineer — Observability, Insights & Governance

Databricks · Data AI · Mountain View, CA · Engineering

Databricks is seeking a Sr. Staff Software Engineer to lead strategy and execution for observability, insights, and governance on its data and AI platform. The role involves setting long-term technical direction, designing and leading high-impact projects, building next-generation agentic experiences that let users and automated agents investigate, diagnose, and act on telemetry, and partnering with the Databricks SQL, Unity Catalog, AI, security, and platform teams to integrate end-to-end visibility and governance. The engineer will mentor senior engineers, set technical standards, and champion reliable software and operational practices. The role requires 12+ years of experience with large-scale distributed systems, observability or governance platforms, databases, or backend infrastructure; experience building agentic experiences using LLMs is a bonus.

What you'd actually do

  1. Establish the long-term technical direction across query observability, warehouse management, and account-level governance, and own the multi-year architecture all three surfaces are built on.
  2. Design and lead high-impact projects that move the needle on performance, reliability, and cost transparency for customers.
  3. Build the next generation of agentic observability and governance — extensible experiences that let users and automated agents investigate, diagnose, and act on telemetry across query, warehouse, and account.
  4. Partner deeply with Databricks SQL, Unity Catalog, AI, security, and platform teams to integrate end-to-end visibility and governance across the data plane and control plane.
  5. Mentor senior engineers, set the technical standards across the group, and recruit top talent into observability, insights, and governance.

Skills

Required

  • 12+ years building and operating large-scale distributed systems, observability or governance platforms, databases, or backend infrastructure
  • Proven track record as a technical leader on teams operating in complex, multi-stakeholder environments
  • Deep computer science fundamentals (algorithms, data structures, systems design)
  • Experience with one or more of: observability and telemetry pipelines, query engines and query analytics, distributed tracing and metrics, time-series storage, data governance, or large-scale logging systems
  • Strong cross-functional communication skills

Nice to have

  • Experience building agentic experiences using LLMs

What the JD emphasized

  • agentic experiences
  • observability
  • governance
  • large-scale distributed systems
  • technical leader

Other signals

  • building agentic experiences
  • observability and governance for AI platform
  • large-scale distributed systems