Technical Program Manager - Performance & Benchmarking

Weights & Biases · Data AI · Bellevue, WA +4 · Technology

This is a Technical Program Manager role focused on Performance & Benchmarking within an AI/ML Platform Services organization. The TPM drives end-to-end program execution for infrastructure validation, performance testing, benchmark execution, and observability, so that CoreWeave's infrastructure is performant and stable for AI workloads. The role partners with engineering, infrastructure, product, and go-to-market teams to improve workload performance, validate new environments, and create visibility into system performance.

What you'd actually do

  1. Drive end-to-end program execution for performance and benchmarking initiatives spanning infrastructure validation, performance testing, benchmark execution, observability, and launch readiness
  2. Partner with engineering and infrastructure teams to deliver programs that verify new hardware platforms, clusters, and software environments meet CoreWeave standards for performance and stability
  3. Lead cross-functional efforts to operationalize benchmarking frameworks that measure model performance, runtime efficiency, GPU utilization, and workload reliability across environments
  4. Coordinate dependencies across platform engineering, infrastructure, capacity, product, and go-to-market teams to ensure performance findings are translated into roadmap priorities, customer readiness, and external proof points
  5. Build program mechanisms for release readiness, benchmark planning, risk management, issue escalation, and post-launch review for performance-sensitive infrastructure initiatives

Skills

Required

  • 5+ years of technical program management experience in cloud infrastructure, distributed systems, high-performance computing, or AI/ML platforms
  • Experience leading large-scale cross-functional programs involving performance engineering, benchmarking, validation systems, or infrastructure readiness
  • Strong technical fluency in distributed systems, GPU or accelerator-based infrastructure, workload performance measurement, and large-scale infrastructure operations
  • Demonstrated ability to define program metrics and drive measurable outcomes in performance, reliability, scale, or operational maturity
  • Excellent communication skills, with experience influencing engineering, product, and infrastructure stakeholders
  • Experience with AI/ML benchmarking, performance analysis, or infrastructure validation for training and inference workloads
  • Familiarity with GPU cluster architecture, workload observability, hardware bring-up, and performance bottleneck analysis
  • Understanding of benchmarking methodologies, reproducibility, test coverage, and the tradeoffs between performance, stability, utilization, and customer readiness
  • Experience building launch processes, release governance, dependency management, and operational review mechanisms in fast-scaling environments

Nice to have

  • Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience

What the JD emphasized

  • performance and benchmarking
  • infrastructure validation
  • performance readiness
  • performance testing
  • benchmark execution
  • observability
  • performance
  • benchmarking
  • performance analysis
  • performance bottleneck analysis

Other signals

  • performance benchmarking
  • infrastructure validation
  • AI/ML workloads
  • GPU infrastructure
  • observability