Technical Program Manager - Infrastructure Engineering

Netflix · Big Tech · United States · Remote · Engineering Operations

Netflix is hiring Technical Program Managers to lead critical initiatives across its Cloud Infrastructure and AI Platform. These roles focus on migrations, adoption, and scalability, partnering closely with engineers and ML practitioners to drive high-impact, cross-functional programs. The TPMs will be responsible for decomposing complex problems, driving execution rigor, and influencing platform evolution.

What you'd actually do

  1. Lead initiatives that drive adoption and scalability of infrastructure platforms, including large-scale platform modernization, migrations, and feature rollouts.
  2. Partner closely with leaders, engineers, PMs, and practitioners to decompose complex problem spaces into technical execution plans with well-defined milestones, ownership, and success criteria.
  3. Drive execution rigor by tracking progress, managing dependencies, and proactively identifying risks, trade-offs, and mitigation strategies while keeping process to a minimum.
  4. Communicate clearly and consistently with technical leadership and stakeholders on progress, risks, and decisions.
  5. Use insights from program execution to influence platform evolution, architectural direction, and operational practices over time.

Skills

Required

  • 7+ years of experience leading large-scale technical programs in infrastructure, internal platforms, or distributed systems, working directly with engineering teams.
  • Experience driving programs across infrastructure layers to develop scalable platforms that support diverse business needs.
  • Proven experience creating partnerships with cross-functional teams, driving large-scale technical strategies, debating technical approaches, and building long-term scalable solutions with engineers.
  • Proven ability to identify gaps in solutions and weigh in on product vs. technology trade-offs.
  • Excellent written and verbal communication skills, with the ability to articulate technical constraints, trade-offs, and business impact to diverse audiences.
  • Self-starter who enjoys quickly bringing organization and direction.
  • Proven ability to operate in 0→1 and evolving problem spaces, quickly bringing order and execution discipline without adding too much process.
  • Understanding of challenges in high-scale distributed systems, architectures, and data layers.

Nice to have

  • Experience with fleet-wide cloud efficiency and/or building a culture of cost efficiency.
  • Experience with machine learning training and experimentation.
  • Experience with machine learning pipelines.
  • Experience with inference at production scale.
  • Experience applying AWS cloud services at a large scale.
  • Experience with Kubernetes used natively in platform services.
  • Experience with third-party vendors/tools for training and inference experimentation.

What the JD emphasized

  • large-scale technical programs in infrastructure
  • scalable platforms
  • cross-functional teams
  • large-scale technical strategies
  • long-term scalable solutions
  • high-scale distributed systems