What you'd actually do

Establish and lead effective program teams to ensure alignment and achieve common objectives

Work closely with engineering, data center, hardware and business stakeholders to define program requirements, prioritize initiatives, and establish scope, including shaping the roadmap and long-term strategy for partner teams

Create and implement communication strategies to proactively share program status, challenges, and risks with stakeholders

Drive successful outcomes by actively managing cross-functional dependencies, mitigating risks, and adjusting scope, timeline, and resources as needed

Collaborate with cross-functional teams to lead the end-to-end lifecycle of programs, including technical analysis, design, development, testing, implementation, and post-launch support

Skills

Required

12+ years of experience in software engineering, hardware engineering, systems engineering, or technical product/program management
Knowledge of software and hardware development for large-scale hardware readiness, including end-to-end product development processes
Experience delivering complex technology programs and products from inception through to successful delivery
Experience defining and optimizing engineering processes at scale
Experience building work relationships across multi-disciplinary teams and with partners in different time zones
Knowledge of large language models, machine learning, and scaling distributed systems
Proven commitment to scaling infrastructure for large-scale AI distributed compute systems

Nice to have

Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies
Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)
Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)

What the JD emphasized

lead complex, large-scale projects focused on advancing language model scaling

driving the end-to-end integration of new AI hardware and core infra stack

scale foundational hardware, software systems, and tools that support Meta’s AI innovation

scale infrastructure for large scale AI distributed compute systems

Meta’s Core Infrastructure team seeks a Technical Program Manager (TPM) to lead complex, large-scale projects focused on advancing language model scaling. In this key position, you will collaborate across engineering, hardware, data center, research, and product teams to design, build, and scale foundational hardware, software systems, and tools that support Meta’s AI innovation. You will be responsible for driving the end-to-end integration of new AI hardware and core infra stack, from initial design validation of our software stack through production deployment. This includes developing and refining repeatable frameworks for efficient onboarding, ensuring robust and predictable execution, and proactively resolving technical and organizational challenges to maintain project momentum. You will use your problem-solving, technical acumen, and business insight to streamline onboarding of new AI hardware platforms into Meta’s suite of core infrastructure services. You will communicate transparently across all levels, motivate multidisciplinary teams, and champion best practices to deliver impactful outcomes that advance Meta’s infrastructure.

Responsibilities

Establish and lead effective program teams to ensure alignment and achieve common objectives
Work closely with engineering, data center, hardware and business stakeholders to define program requirements, prioritize initiatives, and establish scope, including shaping the roadmap and long-term strategy for partner teams
Create and implement communication strategies to proactively share program status, challenges, and risks with stakeholders
Drive successful outcomes by actively managing cross-functional dependencies, mitigating risks, and adjusting scope, timeline, and resources as needed
Collaborate with cross-functional teams to lead the end-to-end lifecycle of programs, including technical analysis, design, development, testing, implementation, and post-launch support
Establish and track key metrics, quality benchmarks, and performance indicators to drive accountability and ensure effective cross-functional execution of program deliverables
Anticipate and evaluate complex, long-term infrastructure challenges in close partnership with engineering leaders and key stakeholders
Drive product strategy to support and align with key company initiatives
Lead process improvements across internal and external teams, streamlining workflows and reducing manual effort through automation

Qualifications

Bachelor of Science in Electrical Engineering, Computer Science, Mechanical Engineering, or a related technical field, or equivalent experience
12+ years of experience in software engineering, hardware engineering, systems engineering, or technical product/program management
Knowledge of software and hardware development for large-scale hardware readiness, including end-to-end product development processes
Excel at clearly communicating complex technical investments in a simple and understandable manner
Experience delivering complex technology programs and products from inception through to successful delivery
Knowledge of understanding user needs, gathering requirements, and defining project scope
Experience working under your own initiative, across multiple teams, demonstrating critical thinking and providing thought leadership in ambiguous spaces
Experience defining and optimizing engineering processes at scale
Excel at building cross-functional relationships, thrive amid complex challenges, excel at clearly communicating complex technical investments in a simple and understandable manner
Experience analyzing and solving complex technical problems in large-scale systems (e.g., root cause analysis, capacity planning, system design trade-offs)
Experience building work relationships across multi-disciplinary teams and with partners in different time zones
Experience defining strategic direction and identifying new opportunities for impact across products, platforms, and programs
Experience communicating at the executive level and influencing leadership and technical management teams to drive the development of systems, solutions, and products
Knowledge of large language models, machine learning, and scaling distributed systems
Demonstrated experience of identifying new opportunities for the larger organization and influencing the appropriate stakeholders
Proven commitment to scaling infrastructure for large-scale AI distributed compute systems
Experience communicating complex technical investments in a clear and understandable manner to executive and cross-functional stakeholders
Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies
Knowledge of software and hardware development for large-scale system readiness
Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)
Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)

Technical Program Manager, Core Infrastructure
