Staff Designated Support Engineer

Databricks · Data AI · India · Engineering

Staff Designated Support Engineer at Databricks, focusing on high-touch specialized support and technical solutions for large customers in the Digital Native Business segment. The role involves advanced troubleshooting of Spark, SQL, Delta, Streaming, and Databricks runtime features, including Mosaic AI Model Service, building POCs, and training customers on best practices for Spark/ML/AI workflows. Requires deep expertise in Big Data, Spark, Data Engineering, AI ecosystems, cloud platforms, and CI/CD, with significant customer-facing experience.

What you'd actually do

  1. Perform advanced troubleshooting and root cause analysis to resolve performance and reliability issues in Spark, SQL, Delta, Streaming, and Databricks runtime features (including Mosaic AI Model Service), using tools such as Spark UI metrics, DAGs, and event logs.
  2. Build rapid POCs, and test, deploy, and monitor solutions built by Databricks Engineering, to address customer challenges and showcase advanced Spark/ML/AI runtime capabilities aligned with the customer's business goals.
  3. Develop comprehensive playbooks and maintain a knowledge base of common issues and solutions for Spark, ML, and AI workflows.
  4. Train customer engineering and business teams on best practices for performance tuning, debugging, and making effective use of Databricks features.
  5. Advocate for customers in business review meetings and maintain close relationships as a trusted advisor and primary technical point of contact.

Skills

Required

  • 8–12 years of experience designing, building, and troubleshooting distributed computing applications
  • 4+ years delivering production-scale Spark/ML/AI solutions using Python, Java, or Scala
  • Hands-on expertise with data lakes, SQL-based databases, and cloud-based data warehousing/ETL tools such as Snowflake, Redshift, and BigQuery
  • Deep knowledge of Spark core internals, Delta/Iceberg, JVM optimization, and memory management
  • Proficiency in AI ecosystems like Machine Learning, Deep Learning, and Generative AI
  • Practical experience with AWS, Azure, or GCP
  • Expertise in building and managing CI/CD pipelines, monitoring, and alerting systems
  • 3–5 years in customer-facing roles such as Technical Account Manager or Solutions Architect
  • Strong communication, relationship-building, and problem-solving skills
  • Proven ability to anticipate, identify, and mitigate risks while planning solutions for production challenges
  • Proven ability to work with cross-functional teams and senior leadership

Nice to have

  • Apache Spark™
  • Mosaic AI Model Service
  • DAGs
  • event logs
  • Python
  • Java
  • Scala
  • Snowflake
  • Redshift
  • BigQuery
  • Delta/Iceberg
  • JVM optimization
  • memory management
  • AWS
  • Azure
  • GCP
  • CI/CD pipelines
  • monitoring
  • alerting systems
  • Technical Account Manager
  • Solutions Architect

What the JD emphasized

  • complex product issues
  • performance and reliability issues
  • Mosaic AI Model Service
  • advanced Spark/ML/AI runtime capabilities
  • performance tuning
  • debugging
  • Big Data and Spark
  • production-scale Spark/ML/AI solutions
  • Spark core internals
  • AI ecosystems
  • Machine Learning
  • Deep Learning
  • Generative AI
  • monitoring and alerting systems
  • Advanced Proactive Problem Solving Skills
  • production challenges

Other signals

  • customer-facing role
  • technical solutions
  • complex product issues
  • performance and reliability issues
  • Spark, SQL, Delta, Streaming, and Databricks runtime features
  • Mosaic AI Model Service
  • Rapid POCs
  • Test/Deploy/Monitor
  • Spark/ML/AI runtime capabilities
  • knowledge base
  • best practices
  • performance tuning
  • debugging
  • cross-functional teams
  • technical presentations
  • production-impacting issues
  • Big Data and Spark
  • distributed computing applications
  • production-scale Spark/ML/AI solutions
  • Data Engineering Specialization
  • Cloud-based Data Warehousing/ETL tools
  • Spark core internals
  • Delta/Iceberg
  • JVM optimization
  • memory management
  • AI ecosystems
  • Machine Learning
  • Deep Learning
  • Generative AI
  • Cloud and CI/CD Skills
  • monitoring and alerting systems
  • Customer-Facing Experience
  • Technical Account Manager
  • Solutions Architect
  • Proactive Problem Solving Skills
  • anticipate, identify, and mitigate risks
  • planning solutions for production challenges
  • Collaboration and Leadership