Technical Program Manager - Compute

Microsoft · Big Tech · Mountain View, CA +4 · Technical Program Management

This role is for a Technical Program Manager focused on compute infrastructure for large-scale AI/ML training and serving of foundation models. The role involves driving projects, collaborating with various teams, leveraging data for optimization, and advocating for resource needs. It is part of Microsoft AI's Superintelligence Team, aiming to push the boundaries of AI while ensuring control and safety.

What you'd actually do

  1. Drive projects and programs related to compute infrastructure, including forecasting and allocating resources such as compute, storage, and network.
  2. Collaborate with product teams, engineers, researchers, and external partners to identify gaps and drive timelines toward resolution and mitigation.
  3. Leverage data and analytics to define metrics and to set baselines and targets for fleet efficiency and optimization.
  4. Advocate for the AI team’s resource needs with executive and working-level partners across Microsoft.
  5. Own the status of key compute projects, proactively identifying risks and proposing solutions to ensure timely delivery.

Skills

Required

  • Bachelor's Degree AND 6+ years' experience in technical program management, infrastructure engineering, AI/ML, or product development OR equivalent experience.
  • 6+ years' experience managing cross-functional and/or cross-team projects.

Nice to have

  • Bachelor's Degree AND 10+ years' experience in technical program management, infrastructure engineering, AI/ML, or product development OR equivalent experience.
  • 10+ years' experience managing cross-functional and/or cross-team projects.

What the JD emphasized

  • Deeply understand the design, deployment, and optimization of large-scale compute infrastructure for AI/ML workloads.
  • Managing high-stakes, time-sensitive, large-scale programs.
  • Advance the AI frontier responsibly.