Engineering Manager, ML Infrastructure at Google

What you'd actually do

Lead our new Workload Optimization (WO) team. In this pivotal role, set the team's technical goals and roadmap and drive its key features.

Collaborate closely with teams across machine learning (ML) and with our product area customers to ensure successful execution.

Shape the team's culture and processes, identify new opportunities, and translate our broader strategy into concrete priorities and projects.

Coach and provide career guidance to your reports, improve our engineering practices, and influence technical direction across the organization.

Navigate open-ended issues and actively contribute to the team's engineering efforts as a technical lead.

Skills

Required

software development
developing infrastructure, distributed systems or networks
compute technologies, storage or hardware architecture
technical leadership role
people management or team leadership role

Nice to have

Master's degree or PhD in Computer Science, or a related technical field
working in a matrixed organization
end-to-end Machine Learning (ML) development lifecycle and infrastructure
communication and cross-team collaboration skills

Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers not only have the technical expertise to take on and provide technical leadership to major projects, but also manage a team of Engineers. You not only optimize your own code but also make sure Engineers are able to optimize theirs. As a Software Engineering Manager, you manage your project goals, contribute to product strategy, and help develop your team. Teams work all across the company, in areas such as information retrieval, artificial intelligence, natural language processing, distributed computing, large-scale system design, networking, security, data compression, and user interface design; the list goes on and is growing every day. Operating with scale and speed, our exceptional software engineers are just getting started -- and as a manager, you guide the way.

With technical and leadership expertise, you manage engineers across multiple teams and locations, manage a large product budget, and oversee the deployment of large-scale projects across multiple sites internationally.

In this role, you will provide end-to-end, fleet-wide scheduling for all Alphabet Machine Learning (ML) workloads, scheduling that is efficient, reliable, and easy to use. You will be responsible for scheduling work on almost all production machines.

The ML, Systems, and Cloud AI (MSCA) organization at Google designs, implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.) and Google Cloud. Our end users are Googlers, Cloud customers and the billions of people who use Google services around the world.

We prioritize security, efficiency, and reliability across everything we do - from developing our latest TPUs to running a global network, while driving towards shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud’s Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers.

The US base salary range for this full-time position is $207,000-$300,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities

Lead our new Workload Optimization (WO) team. In this pivotal role, set the team's technical goals and roadmap and drive its key features.
Collaborate closely with teams across machine learning (ML) and with our product area customers to ensure successful execution.
Shape the team's culture and processes, identify new opportunities, and translate our broader strategy into concrete priorities and projects.
Coach and provide career guidance to your reports, improve our engineering practices, and influence technical direction across the organization.
Navigate open-ended issues and actively contribute to the team's engineering efforts as a technical lead.

Qualifications

Minimum qualifications:

Bachelor’s degree or equivalent practical experience.
8 years of experience in software development.
3 years of experience with developing infrastructure, distributed systems or networks, or experience with compute technologies, storage or hardware architecture.
3 years of experience in a technical leadership role.
2 years of experience in a people management or team leadership role.

Preferred qualifications:

Master's degree or PhD in Computer Science, or a related technical field.
3 years of experience working in a matrixed organization.
Experience with the end-to-end Machine Learning (ML) development lifecycle and infrastructure.
Excellent communication and cross-team collaboration skills.
