What you'd actually do

Deliver the infrastructure vision for systems processing billions in daily billing transactions with zero tolerance for error, building disaster recovery that's provably reliable, testing frameworks that catch what production sees, correctness systems that make billing errors structurally impossible, and observability that predicts failures before they happen

Build Bengaluru's data infrastructure organization by establishing it as the destination for India's top infrastructure talent, hiring multiple engineering managers who become force multipliers, and creating a culture where solving hard distributed systems problems at scale is the daily work

Own business-critical systems operating 24/7/365 across 100+ regions where even 99.9% uptime means hours of customer pain, driving reliability improvements that prevent millions in revenue loss while eliminating operational toil through frameworks that make systems self-healing, self-tuning, and self-documenting

Ship platforms that compound engineering leverage across Databricks: correctness frameworks that catch billing errors before customers do, deployment automation that makes regional expansion push-button, data integration systems that process petabyte-scale flows without human intervention, and testing infrastructure where comprehensive coverage is automatic, not heroic

Position infrastructure as product by treating internal engineering teams as customers with SLAs, measuring adoption and satisfaction, iterating based on feedback, and demonstrating that every dollar invested in infrastructure returns multiplicative gains in product velocity, reliability improvements, or cost reductions

Skills

Required

14+ years in distributed systems engineering
6+ years leading infrastructure organizations
4+ years managing managers
Experience building 99.999%+ reliable systems
Proven ability to scale infrastructure organizations in high-growth environments
Communication skills to make complex infrastructure decisions legible to executives

Nice to have

Technical depth across petabyte-scale data pipelines and distributed systems reliability
Track record defining multi-year infrastructure vision and translating it into sequential deliverables
Established practices for SLOs/SLIs, chaos engineering, disaster recovery, and sophisticated observability
Experience developing engineering managers
Experience creating teams where retention is high because the problems are interesting and the culture is strong
Ability to influence cross-functional partners without authority

(P-1384)

Databricks processes petabytes of data and billions of transaction events daily - every cluster launch, every query executed, every dollar billed flows through infrastructure that must never fail. When we process billions in billing transactions with 99.999% accuracy requirements, when we ingest terabytes per second across 100+ regions, when a five-minute outage costs millions in revenue and customer trust - infrastructure isn't just important, it's existential. The next phase of our growth demands disaster recovery systems that prove reliability rather than hope for it, testing frameworks that catch production-scale problems before deployment, correctness guarantees that make billing errors structurally impossible, and automation that scales operations sublinearly with growth.

In this leadership opportunity, you will build the data infrastructure organization that makes Databricks' continued growth possible. You'll establish foundational teams in Bengaluru owning the bedrock systems that guarantee billing correctness, operational resilience, and zero-downtime recovery across our entire monetization stack, alongside multi-region data ingestion, developer platforms, and deployment automation that eliminate friction at petabyte scale. This isn't about maintaining what exists; it's about architecting the infrastructure that enables Databricks to scale while reducing operational burden. You'll define what world-class infrastructure looks like for the next decade of data platforms.

You will pursue these challenges as a founding technical leader in our fastest-growing engineering hub and as a strategic partner to global infrastructure leaders. In addition to building world-class teams, you will shape architectural decisions that ripple across the company and champion infrastructure-as-product thinking that turns infrastructure into a force multiplier globally. You'll work in an engineering culture born from Apache Spark and open source, where technical depth matters and infrastructure engineers are celebrated as craftspeople.

The perfect candidate has built infrastructure organizations at companies where five nines weren't simply aspirational, where petabyte-scale wasn't marketing but Monday, and where the infrastructure team's technical leverage determined whether the business could scale or stall. You have the technical depth to debate data architecture, the strategic vision to define multi-year platform roadmaps, the leadership craft to build teams that top engineers want to join, and most importantly, the conviction that data infrastructure done right doesn't just support the business; it defines what's possible.

What you’ll need:

14+ years in distributed systems engineering with 6+ years leading infrastructure organizations and 4+ years managing managers at companies where infrastructure failures meant immediate revenue impact, customer escalations, or regulatory consequences - and you built the systems and teams that made those failures rare
Technical depth across petabyte-scale data pipelines and distributed systems reliability where you can engage from "how should we architect multi-region disaster recovery" to "why is this Kafka cluster exhibiting this latency pattern" while knowing when to coach versus when to decide
Track record defining multi-year infrastructure vision and translating it into sequential deliverables that show value quarterly while building toward architectural end states, positioning infrastructure investments as business enablers rather than cost centers, and making build-vs-buy decisions that compound over time
Experience building 99.999%+ reliable systems with established practices for SLOs/SLIs, chaos engineering, disaster recovery, and sophisticated observability that predicts failures before they happen
Proven ability to scale infrastructure organizations in high-growth environments where you've doubled engineering while maintaining quality bar, developed engineering managers, and created teams where retention is high because the problems are interesting and the culture is strong
Communication skills to make complex infrastructure decisions legible to executives (translating technical investments into business outcomes), influence cross-functional partners without authority, build trust across global teams in different timezones with different working styles, and represent Databricks' technical brand externally
BS in Computer Science or Engineering; MS or Ph.D. preferred. Experience with Apache Spark, Delta Lake, large-scale data infrastructure, fintech/billing systems, or leading infrastructure through hypergrowth strongly preferred

About Databricks

Databricks is the data and AI company. More than 10,000 organizations worldwide — including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500 — rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.

Benefits

At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region click here.

Our Commitment to Diversity and Inclusion

At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Compliance

If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.