Technical Program Manager, Inference

Weights & Biases · Data AI · Bellevue, WA +3 · Technology

CoreWeave is seeking a Technical Program Manager (TPM) focused on inference to join its AI/ML Platform Services team. The TPM will drive end-to-end program management for inference platform initiatives, including reliability, customer onboarding, launch readiness, and runtime optimization. Working closely with engineering, product, and infrastructure teams, the TPM will lead cross-functional programs that improve how scalable, reliable, high-performance inference services are launched, onboarded, operated, and optimized. The role requires strong technical fluency in distributed inference systems, GPU compute, and cloud-native architectures, with a focus on measurable improvements in reliability, performance, and customer delivery.

What you'd actually do

  1. Drive end-to-end program management for inference platform initiatives spanning reliability, customer onboarding, launch readiness, and runtime optimization
  2. Lead cross-functional programs for customer onboarding across dedicated and serverless inference offerings, ensuring clear ownership, launch criteria, and readiness for strategic customer use cases
  3. Drive launch readiness for new inference capabilities by aligning teams around real customer outcomes, supportability, and end-to-end validation
  4. Partner with engineering and product to define and deliver roadmap outcomes for latency, throughput, uptime, operational quality, and price-performance
  5. Coordinate multi-team execution across platform, infrastructure, and customer-facing teams to deliver reliable and scalable inference services

Skills

Required

  • 8+ years of technical program management experience in distributed systems, cloud infrastructure, or AI/ML platform engineering
  • Proven experience driving large-scale infrastructure or platform programs from concept to production in complex, cross-functional environments
  • Strong technical fluency in distributed inference systems, GPU compute, cloud-native architectures, and performance optimization
  • Demonstrated success driving measurable improvements in reliability, performance, operational readiness, or customer delivery
  • Excellent written and verbal communication skills, with the ability to align engineering, product, infrastructure, and customer-facing stakeholders around shared goals
  • Experience with inference-serving systems, model onboarding workflows, rollout strategies, and observability tooling
  • Familiarity with launch readiness, supportability, incident follow-through, and release validation for production infrastructure or platform services
  • Understanding of customer onboarding for technical products, especially where platform capabilities, infrastructure readiness, and support processes must align for launch
  • Experience operating in high-growth environments where roadmap execution, reliability expectations, and customer commitments must be managed in parallel

Nice to have

  • Bachelor's degree in a technical field or equivalent practical experience

What the JD emphasized

  • inference platform delivery
  • customer onboarding
  • runtime optimization
  • launch readiness
  • reliability
  • performance
  • operational readiness
  • customer delivery
  • inference-serving systems
  • model onboarding workflows
  • rollout strategies
  • observability tooling
  • supportability
  • incident follow-through
  • release validation

Other signals

  • AI/ML Platform Services
  • inference platform delivery
  • customer onboarding
  • runtime optimization
  • scalable, reliable production inference services