Technical Program Manager, IaaS

Weights & Biases · Data AI · Sunnyvale, CA · Technology

The role is for a Technical Program Manager (TPM) focused on CoreWeave's Infrastructure as a Service (IaaS) for their CPU Compute platform, which complements GPU acceleration in AI clusters. The TPM will lead cross-functional programs to convert product strategy into scalable, reliable, and observable infrastructure, focusing on performance and scalability initiatives for high-throughput CPU clusters in AI environments. This involves managing programs, defining scope, partnering with engineering and product teams, and driving process improvements for complex, high-throughput platforms. While the role operates within an AI-focused company and supports AI infrastructure, the core responsibilities are in program management for cloud infrastructure, not direct AI/ML model development or research.

What you'd actually do

  1. Lead end-to-end technical program management for critical cloud infrastructure initiatives, including planning, execution, delivery, and retrospectives.
  2. Define program scope, milestones, and success metrics while managing risks and dependencies.
  3. Partner closely with engineering, product management, operations, and security teams to ensure alignment on priorities and deliverables.
  4. Act as the primary point of contact for stakeholders, providing regular status updates, addressing risks, and ensuring accountability.
  5. Facilitate and influence technical discussions and decisions to align with long-term infrastructure goals and business objectives.

Skills

Required

  • Bachelor’s degree in a technical field or equivalent experience.
  • 8+ years of experience managing large-scale, complex programs in a fast-paced, technology-driven environment.
  • Strong understanding of cloud computing concepts and infrastructure as a service (IaaS).
  • Exceptional leadership, interpersonal, and influencing skills with a proven ability to build relationships across technical and non-technical teams.
  • Excellent written and verbal communication skills, with the ability to convey complex technical concepts to diverse audiences.
  • Proficiency in JIRA, Confluence, and other program management tools.
  • Strong analytical, critical-thinking, and problem-solving skills with a focus on delivering results.
  • Proven track record in program management, process definition and improvement, and influencing adoption of defined processes across multiple teams or organizations.
  • Ability to lead and influence cross-functional teams to prioritize, manage tradeoffs, identify gaps and risks, drive accountability, and measure successes.
  • Comfortable handling ambiguity and driving clarity. Experience operating autonomously across multiple teams and organizations.

Nice to have

  • Master's or advanced technical degree.
  • Familiarity with networking, storage, containerization (Kubernetes), and infrastructure.
  • Comfortable working in a fast-moving environment and flexible in working with a variety of leaders.
  • Experience driving and leading DevOps initiatives, collaborating with cross-functional teams to improve development processes, deployment pipelines, and system reliability.
  • Experience in incident management and root cause analysis (RCA).
  • Experience building cloud infrastructure and applications.

What the JD emphasized

  • 8+ years of experience managing large-scale, complex programs in a fast-paced, technology-driven environment.
  • Proven track record in program management, process definition and improvement, and influencing adoption of defined processes across multiple teams or organizations.
  • Ability to lead and influence cross-functional teams to prioritize, manage tradeoffs, identify gaps and risks, drive accountability, and measure successes.