Staff Software Engineer (Data Platform)

Databricks · Data AI · Bangalore, India · Engineering - Pipeline

Staff Software Engineer on the Data Platform team at Databricks, responsible for building the Data Intelligence Platform. This involves designing and running the metrics store, cross-company data intelligence platform, tooling for managing Databricks infrastructure, ETL frameworks, data pipelines, and APIs for telemetry and event logs. The role requires experience in large-scale distributed systems, data pipelines, and orchestration frameworks.

What you'd actually do

  1. Design and run the Databricks metrics store that enables all business units and engineering teams to bring their detailed metrics into a common platform for sharing and aggregation, with high quality, introspectability, and query performance.
  2. Design and run the cross-company Data Intelligence Platform, which contains every business and product metric used to run Databricks. You’ll play a key role in developing the right balance of data protections and ease of shareability for the Data Intelligence Platform as we transition to a public company.
  3. Develop tooling and infrastructure to efficiently manage and run Databricks on Databricks at scale, across multiple clouds, geographies and deployment types. This includes CI/CD processes, test frameworks for pipelines and data quality, and infrastructure-as-code tooling.
  4. Design the base ETL framework used by all pipelines developed at the company.
  5. Partner with our engineering teams to provide leadership in developing the long-term vision and requirements for the Databricks product.

Skills

Required

  • 12+ years of industry experience
  • 4+ years of experience building large-scale distributed systems
  • 5+ years providing technical leadership on large projects similar to the ones described above: ETL frameworks, metrics stores, infrastructure management, and data security.
  • Experience building, shipping and operating reliable multi-geo data pipelines at scale.
  • Experience working with and operating workflow or orchestration frameworks, whether open-source tools like Airflow and dbt or commercial enterprise tools.
  • Experience with large-scale messaging systems such as Kafka or RabbitMQ, or commercial equivalents.
  • Excellent cross-functional collaboration and communication skills; a consensus builder.

Nice to have

  • Passion for data infrastructure and for enabling others by making their data easier to access.
