Principal Development Engineer

CVS Health · Healthcare · Work at Home, TX +49 · Innovation and Technology

This Principal Software Development Engineer will own the AI Platform within HCD, focusing on LLM gateway strategy, Model Context Protocol (MCP) design, reference architectures, and the Agent Development Lifecycle (ADLC). The role involves leading architectural decisions for agents, setting engineering standards, resolving complex cross-cutting problems, and ensuring quality gates, eval framework design, and production readiness. Responsibilities include owning observability, building CI/CD pipelines with LLM evaluations, defining AI SDLC standards, and managing AI spend and metrics. The role requires deep expertise in Python, cloud infrastructure (Azure), API gateway patterns, and LLM routing, with a focus on sensitive data handling and compliance in a regulated environment.

What you'd actually do

  1. Own the LLM gateway strategy end-to-end, including model access governance and latency benchmarking across routing layers
  2. Own and continuously evolve the ADLC framework — the team's standard for taking agents from use case discovery through infrastructure planning, evaluation design, development, testing, and production deployment
  3. Provide principal-level technical leadership across the active agent portfolio
  4. Own the observability mandate — driving adoption across all production agents and defining eval standards that reflect real business outcomes, not just infrastructure uptime
  5. Maintain the AI spend tracking system — cost per team, cost per agent, cost per model — with automated reporting for senior leadership

Skills

Required

  • 9+ years of professional software engineering experience
  • Deep expertise in Python
  • Strong cloud infrastructure experience
  • Hands-on familiarity with API gateway patterns
  • LLM routing
  • Demonstrated ability to influence architecture and engineering standards across multiple teams and work streams without direct management authority

Nice to have

  • 3+ years in a principal, staff, or equivalent senior individual contributor role
  • Hands-on experience with AI observability tooling such as Arize, LangSmith, or Phoenix
  • Ability to design eval pipelines that measure what actually matters for business outcomes
  • Proven track record designing and operating production AI systems — LLM APIs, agent frameworks, RAG architectures, or RPA automation — at meaningful scale
  • Strong working knowledge of security and compliance requirements for AI systems handling sensitive data, including PHI and PII handling in a regulated environment
  • Azure experience

What the JD emphasized

  • AI Platform Architecture
  • Agent Development Lifecycle (ADLC) Ownership
  • Production Agent Portfolio Technical Leadership
  • Observability, Evals & Quality
  • FinOps, Metrics & Governance
  • LLM gateway strategy
  • Model Context Protocol (MCP)
  • reference architectures
  • agent delivery
  • Skills and Plugin Marketplace strategy
  • ADLC framework
  • engineering quality gates
  • eval framework design
  • hallucination rate targets
  • adversarial test suites
  • business outcome metrics
  • structured rollback procedures
  • canary deployment patterns
  • production readiness standards
  • cross-cutting technical challenges
  • sensitive data handling
  • integration patterns
  • BrowserUse automation reliability
  • RAG retrieval quality at scale
  • latency under real practical scenarios
  • deployment automation strategy
  • observability mandate
  • eval standards
  • CI and CD pipelines
  • LLM evaluations
  • prompt versioning
  • hallucination and failure rate tracking
  • regression test suites
  • AI SDLC
  • guardrails infrastructure
  • policy enforcement
  • AI quality pipeline
  • AI spend tracking system
  • AI metrics dashboard
  • compliance and security design leadership
  • PHI and PII handling
  • regulated environment
  • Deep expertise in Python
  • strong cloud infrastructure experience
  • API gateway patterns
  • LLM routing
  • Demonstrated ability to influence architecture and engineering standards
  • Hands-on experience with AI observability tooling
  • design eval pipelines that measure what actually matters for business outcomes
  • Proven track record designing and operating production AI systems
  • LLM APIs
  • agent frameworks
  • RAG architectures
  • RPA automation
  • meaningful scale
  • Strong working knowledge of security and compliance requirements for AI systems handling sensitive data
