Infrastructure Operations Program Manager

Weights & Biases · Data AI · Bellevue, WA +4 · Global Field Organization

This Infrastructure Operations Program Manager role is at CoreWeave, a cloud provider focused on AI workloads. The role operationalizes and scales the company's bare-metal support program, using data analysis and reporting to improve client experience and operational processes. Responsibilities include owning data processes and tooling, driving operational insights through data, partnering cross-functionally, building data models and reporting layers, turning ad hoc processes into standardized workflows, developing key metrics and dashboards, and enabling a metrics-driven operating model. The ideal candidate has program or operations management experience in cloud infrastructure or datacenter operations, strong data analysis and reporting skills, and familiarity with support or service delivery environments and cloud computing concepts.

What you'd actually do

  1. Own and evolve data processes and tooling across all support workflows, including break/fix, incident response, maintenance, RMA, and customer escalations.
  2. Drive operational insights through data, improving how we collect, structure, and analyze information to surface trends, enforce accountability, reduce repeat issues, and proactively report on program performance.
  3. Partner cross-functionally with engineering, operations, vendors, and customers to align on data definitions, reporting standards, and workflow instrumentation.
  4. Partner with data teams to build and maintain scalable data models and reporting layers, working with both structured and unstructured data to create reliable tables, views, and dashboards.
  5. Transform ad hoc processes into standardized workflows, enabling consistent tracking and measurement across regions, vendors, and customers.

Skills

Required

  • 3+ years of program or operations management experience in cloud infrastructure, datacenter operations, or high-performance computing environments.
  • Strong ability to bridge operations and data, with hands-on experience in support environments, data analysis, reporting, and visualization across multiple systems and sources.
  • Experience working with support or service delivery environments (e.g., ticketing systems, incident management, escalation workflows).
  • Proficiency in working with structured and unstructured data, including building datasets, defining metrics, and ensuring data quality and consistency.
  • Demonstrated ability to design and implement scalable processes and workflows, turning ambiguity into repeatable, measurable systems.
  • Familiarity with cloud computing concepts, containerization, and/or infrastructure hardware environments.
  • Excellent communication and stakeholder management skills, with the ability to work effectively across technical and non-technical teams, including external partners.
  • Proven problem-solving mindset, with the ability to operate effectively in fast-paced, evolving environments.
  • Experience creating clear, customer-facing reports and documentation that translate operational data into actionable insights.

Nice to have

  • Working knowledge of hardware troubleshooting and datacenter operations is a strong plus.
  • Curiosity about Kubernetes, Docker, and containerized infrastructure.
  • Strong problem-solving skills with a proactive and analytical mindset.
  • Excellent communication skills and a demonstrated ability to work collaboratively in a fast-paced environment.

What the JD emphasized

  • operationalizing and scaling CoreWeave’s bare-metal support program globally
  • data program
  • data-first management
  • data models and reporting layers
  • structured and unstructured data
  • standardized workflows
  • metrics and dashboards
  • metrics-driven operating model
  • tooling, reporting, and workflows