Staff Data Engineer

CVS Health · Healthcare · Work at Home, TX +49 · Innovation and Technology

The Staff Data Engineer at CVS Health is responsible for architecting and building petabyte-scale data pipelines and self-service data platforms. The role involves modernizing data infrastructure toward a Data Mesh approach, developing internal tools and APIs, and integrating AI tooling to accelerate the data engineering SDLC. The primary focus is on data ingestion, access, and management, so that data owners can manage their own data quality.

What you'd actually do

  1. Engineer scalable, reliable, and performant data pipelines to assemble large and intricate datasets using SQL, DBT, and Snowflake, ensuring high data availability and integrity.
  2. Independently design and maintain internal React (TypeScript) interfaces and Python backend services that automate data ingestion and discovery, reducing lead times for application teams from weeks to minutes.
  3. Build and maintain production-grade REST and gRPC APIs that serve as the high-performance interface between our Snowflake data layer and downstream consumer touchpoints.
  4. Implement a GitOps model for data using GitHub Actions and Argo/Kargo, integrating standardized logging, alerting, and automated observability into the heart of all data products.
  5. Leverage Cursor AI, MCPs, and other AI tooling to accelerate the data engineering SDLC, from optimizing complex SQL queries to automating schema migrations.

Skills

Required

  • Python
  • SQL
  • Cloud data warehouses (Snowflake, AWS, GCP)
  • ETL/ELT pipeline development
  • Data modeling

Nice to have

  • DBT
  • React
  • RESTful APIs
  • GitHub Actions
  • GitOps
  • Argo/Kargo
  • Kubernetes
  • Messaging and streaming (Kafka, SNS, RabbitMQ)
  • Cursor AI
  • GitHub Copilot
  • Observability tools (metrics, logging, alerting)
  • Data structures
  • Algorithms
  • Async programming
  • Parallel programming
  • HL7 V2.x
  • FHIR

What the JD emphasized

  • 7+ years of experience in Data Engineering with a heavy focus on Python as the primary scripting and backend language.
  • 7+ years of experience with SQL and cloud data warehouses (e.g., Snowflake, AWS, GCP).
  • 7+ years of experience building high-volume ETL/ELT pipelines and data modeling.