Staff Software Engineer, Data Governance & Foundations

Instacart · Consumer · United States · Remote · Software Engineering

Instacart is hiring a Staff Software Engineer to join the Data Governance and Foundations team. The role owns the architecture and delivery of the open lakehouse foundation, governance and access patterns, and multi-engine compute strategy. It drives real-time and streaming infrastructure for critical use cases, pioneers AI-native data infrastructure engineering by applying LLM/AI tools to the platform lifecycle and embedding AI-powered capabilities into the platform, and elevates engineering excellence through architecture reviews and mentorship.

What you'd actually do

  1. Translate Instacart’s data strategy (e.g., monetization, federated access, real-time) into an actionable multi-year architecture roadmap; align with leadership while evolving the platform for scale, maturity, and cost efficiency.
  2. Own the open lakehouse foundation: define and deliver unified table formats, storage governance, and a multi-engine compute portfolio (interactive, batch, streaming) that enables portability and prevents lock-in.
  3. Drive real-time and streaming infrastructure for critical use cases (Ads, Fraud, ML): set deployment patterns, SLAs, and operational practices that balance performance, availability, and spend.
  4. Pioneer AI-native data infrastructure engineering by applying LLM/AI tools to the platform lifecycle (accelerating development, automation, observability, and cost optimization) and partnering to embed AI-powered capabilities into the platform.
  5. Elevate engineering excellence: lead architecture reviews, mentor senior/staff engineers, influence hiring, and clearly communicate complex trade-offs to both technical and executive audiences to ensure cross-org alignment.

Skills

Required

  • 5+ years of software engineering experience building and operating data infrastructure or distributed systems at production scale.
  • Hands-on expertise with modern data lakehouse architectures and open table formats (e.g., Apache Iceberg, Delta Lake, Hudi) and with distributed query/compute engines (e.g., Trino, Spark, ClickHouse), including performance tuning and production reliability.
  • Experience with event-driven and streaming infrastructure (e.g., Kafka, Flink) for real-time pipelines and serving systems.
  • Proven ownership of major platform transitions or migrations (build vs. buy, migration design, risk management) delivered to production.
  • Ability to build cost/benefit and TCO models for infrastructure investments and to drive alignment via clear architecture docs and strategy memos across multiple teams and leadership levels.

Nice to have

  • Experience designing platform-level governance controls and familiarity with compliance frameworks (e.g., SOX, CPRA, GDPR).
  • FinOps experience optimizing data platform spend, including managing multi-million dollar infrastructure budgets and negotiating vendor contracts.
  • Deep SQL proficiency and strong skills in Python or Scala for systems-level development.
  • Experience with orchestration (e.g., Apache Airflow) and data transformation pipelines (e.g., dbt) in large-scale production environments.
  • Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience.

What the JD emphasized

  • production scale
  • production reliability
  • major platform transitions or migrations
  • multi-million dollar infrastructure budgets