What you'd actually do

Lead a team of software engineers focused on identifying and maintaining ML training and serving benchmarks that are representative of Google production and the broader ML industry.

Achieve performance for customer launches and, in the case of third-party/open-source software (OSS) models, for the benchmark submissions the team takes part in (ML Commons, InferenceX, etc.).

Use benchmarks to identify performance opportunities and drive both near-term SOTA (e.g., custom kernels) and out-of-the-box performance (compiler/runtime optimizations, agentic tooling, auto-sharding), directly and in collaboration with partner teams.

Participate in algorithmic innovations exploiting new TPU hardware features and model-preserving optimizations (speculative decoding, sparsity, quantization, LoRA, etc.).

Participate in co-designing TPU-friendly models that showcase model quality at performance exceeding that of OSS models, which are typically designed on GPUs.

Skills

Required

software development
leading ML design
optimizing ML infrastructure
technical leadership
people management
ML performance analysis
benchmarking
computer architecture

Nice to have

ML accelerators (GPUs, TPUs)
low-level kernel programming/tuning
CUDA
Triton
Pallas
compiler optimization
MLIR
OpenXLA
integrating frameworks/serving libraries
PyTorch
JAX
vLLM
hardware design

Other signals

TPU Performance team

maximizing the speed and efficiency of Google’s custom AI chips (TPUs) for training and running massive AI/ML models

optimization partners for both Google's internal teams and major external AI companies and foundation model builders

customer launches

ML Commons, InferenceX

compiler/runtime optimizations

agentic tooling

auto-sharding

speculative decoding, sparsity, quantization, LoRA

ML accelerators (GPUs, TPUs)

low-level kernel programming/tuning

compiler optimization (MLIR, OpenXLA)

integrating frameworks/serving libraries (PyTorch, JAX, vLLM)

Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers not only have the technical expertise to take on and provide technical leadership for major projects, but also manage a team of Engineers. You not only optimize your own code but also make sure Engineers are able to optimize theirs. As a Software Engineering Manager, you manage your project goals, contribute to product strategy, and help develop your team. Teams work all across the company, in areas such as information retrieval, artificial intelligence, natural language processing, distributed computing, large-scale system design, networking, security, data compression, and user interface design; the list goes on and is growing every day. Operating with scale and speed, our exceptional software engineers are just getting started -- and as a manager, you guide the way.

With technical and leadership expertise, you manage engineers across multiple teams and locations, manage a large product budget, and oversee the deployment of large-scale projects across multiple sites internationally.

Google’s Core Machine Learning (ML) organization is looking for an Engineering Manager to join our pioneering TPU Performance team! Our team is responsible for maximizing the speed and efficiency of Google’s custom AI chips (TPUs) for training and running massive AI/ML models.

While we have a rich 10-year history of optimizing Google’s own internal AI models, our team is entering an exciting new phase. As Google expands its focus to become a major hardware provider for the broader tech industry, we are optimization partners for both Google's internal teams and major external AI companies and foundation model builders.

The AI and Infrastructure team is redefining what’s possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.

We're the driving force behind Google's groundbreaking innovations, empowering the development of our cutting-edge AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.

Individual pay is determined by factors including job-related skills, experience, and relevant education or training.

US: $207,000 - $301,000 (USD) + 20% bonus target + equity + benefits

Learn more about benefits at Google.

Responsibilities

Lead a team of software engineers focused on identifying and maintaining ML training and serving benchmarks that are representative of Google production and the broader ML industry.
Achieve performance for customer launches and, in the case of third-party/open-source software (OSS) models, for the benchmark submissions the team takes part in (ML Commons, InferenceX, etc.).
Use benchmarks to identify performance opportunities and drive both near-term SOTA (e.g., custom kernels) and out-of-the-box performance (compiler/runtime optimizations, agentic tooling, auto-sharding), directly and in collaboration with partner teams.
Participate in algorithmic innovations exploiting new TPU hardware features and model-preserving optimizations (speculative decoding, sparsity, quantization, LoRA, etc.).
Participate in co-designing TPU-friendly models that showcase model quality at performance exceeding that of OSS models, which are typically designed on GPUs.

Qualifications

Minimum qualifications:

Bachelor’s degree or equivalent practical experience.
8 years of experience in software development.
5 years of experience leading ML design and optimizing ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine-tuning).
3 years of experience in a technical leadership role.
2 years of experience in a people management or team leadership role.
Experience with ML performance analysis, benchmarking, and computer architecture.

Preferred qualifications:

Master’s degree or PhD in Engineering, Computer Science, or a related technical field.
3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.
Experience in ML accelerators (GPUs, TPUs) and low-level kernel programming/tuning using tools like CUDA, Triton, or Pallas.
Experience with compiler optimization (MLIR, OpenXLA) and integrating frameworks/serving libraries (PyTorch, JAX, vLLM) to maximize hardware efficiency.
Ability to adapt ML models to specific hardware strengths and use performance benchmarking to guide both optimization and future hardware design.

Engineering Manager, ML Performance
