Senior Technical Program Manager, ML Infrastructure Acceleration

Google · Big Tech · Sunnyvale, CA +1

This role is for a Senior Technical Program Manager focused on ML Infrastructure Acceleration at Google. The individual will manage one of the world's largest Machine Learning fleets in the era of AI, ML, and Agents. Responsibilities include managing supply projections for GPU or TPU portfolios, communicating supply status and risks, evaluating acceleration options, and publishing system and tooling requirements to drive automation. The role requires strong program management skills, cross-functional collaboration, and executive-level communication, with a focus on scaling AI and infrastructure capabilities.

What you'd actually do

  1. Maintain and publish a weekly Plan of Record (PoR) supply projection for one or more products in the GPU or TPU portfolio.
  2. Communicate supply status, risks, and blockers to leads across all key stakeholder teams and executives.
  3. Evaluate acceleration options: set acceleration goals, drive progress toward them, and perform related analyses such as demand/supply trade-offs and risk assessments, in partnership with delivery teams across Global Data Centers (GDC), Global Grid and Infrastructure (GGI), Chief Supply Chain Office (CSCO), Platform and Infrastructure Engineering (PIE), and SRE.
  4. Publish system and tooling requirements to drive automation and scale in planning and execution.

Skills

Required

  • program management
  • degree in a technical field or equivalent practical experience
  • 8 years of experience in program management

Nice to have

  • managing cross-functional or cross-team projects
  • collaborating and influencing stakeholders across all levels and spanning multiple organizations
  • executive-level communication
  • customer engagement with passion for customer success
  • ability to shift between direct detailed analysis and big picture thinking and customize communication based on the audience

What the JD emphasized

  • manage one of the world’s largest Machine Learning fleets in the era of AI, ML, and Agents
  • regularly communicate with executive management
  • AI and Infrastructure team is redefining what’s possible
  • delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity
  • shaping the future of world-leading hyperscale computing
  • development of our TPUs
  • Vertex AI for Google Cloud
  • Google Global Networking
  • Data Center operations
  • systems research