Senior Software Engineer, Data Governance & Foundations

Instacart · Consumer · United States · Remote · Software Engineering

Instacart is seeking a Senior Software Engineer for its Data Governance & Foundations team. The role focuses on building and operating core systems for the company's data ecosystem, including a large-scale data lakehouse, ingestion, stream processing, and self-serve tooling. The engineer will define multi-year architecture roadmaps, own platform initiatives, partner with vendors, embed governance and compliance controls, optimize infrastructure spend, and mentor other engineers. The role requires 5+ years of experience in data infrastructure or distributed systems, with expertise in modern data lakehouse architectures, event-driven infrastructure (Kafka, Flink), and distributed query systems (Trino, Spark). Experience with data governance frameworks and FinOps is preferred.

What you'd actually do

  1. Define and drive multi-year architecture roadmaps for large-scale data ingestion and processing infrastructure, setting technical direction that balances reliability, scalability, and cost.
  2. Own end-to-end platform initiatives — from build vs. buy decisions and migration design through production rollout and risk management — across Kafka-based streaming and Postgres-based systems.
  3. Partner with vendors (Snowflake, Databricks, Confluent) on technical integration, contract evaluation, and TCO modeling to inform infrastructure investment decisions.
  4. Collaborate with cross-functional teams to embed governance and compliance controls (SOX, CPRA, GDPR) directly into platform architecture and data lifecycle management.
  5. Optimize infrastructure spend at scale: identify cost reduction opportunities across compute, storage, and pipeline efficiency; manage multi-million dollar infrastructure budgets.

Skills

Required

  • 5+ years of software engineering focused on data infrastructure or distributed systems at scale
  • modern data lakehouse architectures and open table formats (Apache Iceberg, Delta Lake, Hudi)
  • distributed query and compute systems (Trino, Spark, ClickHouse)
  • event-driven infrastructure: Kafka for high-throughput data ingestion and Flink (or equivalent) for stream processing at scale
  • track record of owning and executing major platform transitions
  • experience building business cases for infrastructure investments
  • exceptional written technical communication

Nice to have

  • data governance and compliance frameworks (SOX, CPRA, GDPR)
  • FinOps and data platform cost optimization
  • managing large infrastructure budgets
  • negotiating enterprise vendor contracts
  • deep SQL expertise
  • Python or Scala for systems-level work
  • orchestration (Apache Airflow)
  • transformation pipelines (dbt)

What the JD emphasized

  • Define and drive multi-year architecture roadmaps
  • Own end-to-end platform initiatives
  • Partner with vendors
  • Collaborate with various teams to embed governance and compliance controls
  • Optimize infrastructure spend at scale
  • 5+ years of software engineering focused on data infrastructure or distributed systems at scale
  • Modern data lakehouse architectures
  • Event-driven infrastructure
  • Track record owning and executing major platform transitions
  • Experience building business cases for infrastructure investments
  • Exceptional written technical communication
  • Strong ownership and comfort operating in ambiguity