Sr Backline Engineer (Apache Spark™)

Databricks · Data AI · India · Support

This role focuses on deep code-level analysis and troubleshooting of Apache Spark, Spark SQL, Structured Streaming, and Databricks Delta to resolve complex customer issues. The engineer will act as a technical bridge between support and engineering, provide best practices, contribute to automation, identify bugs, and coordinate issue resolution. Experience developing, testing, and sustaining Python-, Java-, or Scala-based applications, navigating the Spark source code, and working with Big Data/Hadoop/Spark/Kafka/Elasticsearch pipelines is required.

What you'd actually do

  1. Troubleshoot and resolve complex customer issues through deep code-level analysis of Apache Spark™ core internals, Spark SQL, Structured Streaming and Databricks Delta.
  2. Provide best practices guidance around Spark runtime performance and usage of Spark core libraries and APIs for custom-built solutions developed by Databricks customers.
  3. Help the support team with detailed troubleshooting guides and runbooks.
  4. Contribute to automation and tooling programs to make daily troubleshooting efficient.
  5. Work with the Spark Engineering Team and spread awareness of upcoming features and releases.

Skills

Required

  • Apache Spark core internals
  • Spark SQL
  • Structured Streaming
  • Databricks Delta
  • Python
  • Java
  • Scala
  • Big Data
  • Hadoop
  • Kafka
  • Elasticsearch
  • SQL-based database systems
  • JVM troubleshooting
  • GC troubleshooting
  • Thread dump troubleshooting
  • AWS
  • Azure

What the JD emphasized

  • 8+ years of industry experience developing, testing, and sustaining Python-, Java-, or Scala-based applications.
  • Comfortable with compiling, building and navigating the Apache Spark source code.
  • Comfortable with identifying and applying patches/bug fixes to the Apache Spark source code.
  • Experience in Big Data/Hadoop/Spark/Kafka/Elasticsearch data pipelines.
  • Experience in JVM, GC, and thread-dump-based troubleshooting is required.