Executive Director, AI Infrastructure & Platform Engineering

CVS Health · Healthcare · Work at Home, NY +51 · Innovation and Technology

This Executive Director role is responsible for building and operating CVS Health's on-premises AI compute platform, including GPU clusters, networking, storage, and orchestration layers. It focuses on infrastructure, reliability, security, and compliance (HIPAA, NIST AI RMF) for frontier AI workloads.

What you'd actually do

  1. Define and execute the long-range vision and strategy for AI infrastructure and platform engineering, with availability (>99.99%), reliability, and platform performance as the primary measures of success.
  2. Recruit, hire, develop, and retain a high-performing engineering organization spanning infrastructure, network, platform reliability, observability, security, 24/7 operations, change and release management, and FinOps.
  3. Own the physical layer of the AI compute environment — GPU compute, storage, network fabric, capacity planning, and hardware lifecycle accountability.
  4. Direct bare-metal Kubernetes and OpenShift operations, including cluster administration, GPU quota governance, infrastructure-as-code adoption, and availability baseline enforcement.
  5. Build and sustain a high-performing 24/7 operations model — designed for sustainable, predictable coverage with no mandatory overtime and measurable team health and retention.

Skills

Required

  • AI infrastructure management
  • Platform engineering
  • GPU compute management
  • Network fabric operations (RoCE v2, spine-leaf)
  • Bare-metal Kubernetes/OpenShift operations
  • Site Reliability Engineering (SRE)
  • Observability and monitoring
  • Change and release management
  • FinOps
  • Security posture management
  • Compliance frameworks (HIPAA, NIST AI RMF)
  • Data center operations
  • Capacity planning
  • Hardware lifecycle management
  • Team leadership and hiring

Nice to have

  • NVIDIA Blackwell systems

What the JD emphasized

  • frontier-class GPU compute environment
  • greenfield organizational build
  • HIPAA
  • NIST AI RMF
  • 10+ years of engineering leadership experience, with substantial time directly owning physical infrastructure at data center scale

Other signals

  • standing up, operating, and continuously improving CVS Health's on-premises AI compute platform
  • define the operating model, set the engineering standards, hire and develop the team
  • high-performance 24/7 operations model
  • robust compliance with frameworks including HIPAA and NIST AI RMF