Like Google's own ambitions, the work of a Software Engineer goes beyond just Search. Software Engineering Managers not only have the technical expertise to take on and provide technical leadership to major projects, but also manage a team of Engineers. You not only optimize your own code but also make sure Engineers are able to optimize theirs. As a Software Engineering Manager, you manage your project goals, contribute to product strategy and help develop your team. Teams work all across the company, in areas such as information retrieval, artificial intelligence, natural language processing, distributed computing, large-scale system design, networking, security, data compression and user interface design; the list goes on and is growing every day. Operating with scale and speed, our exceptional software engineers are just getting started -- and as a manager, you guide the way.
With technical and leadership expertise, you manage engineers across multiple teams and locations, manage a large product budget, and oversee the deployment of large-scale projects across multiple sites internationally.
The Emergent AI infrastructure team at Google is looking to build the next generation of on-premises Artificial Intelligence (AI) infrastructure, bringing the best of Google to empower frontier model and AI solution builders to advance the state of the art in AI around the world.
In this role, you will fully integrate AI infrastructure systems, from hardware to software design and workload management. You will have experience building very large AI clusters using the latest technologies for AI acceleration and cluster interconnects and networking.
Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
Responsibilities
- Set and communicate team priorities that support the broader organization's goals. Align strategy, processes, and decision-making across teams.
- Set clear expectations with individuals based on their level and role and aligned to the broader organization's goals. Meet regularly with individuals to discuss performance and development and provide feedback and coaching.
- Lead and support software engineers in AI infrastructure for Testbed Turn-up and Operations, including bare metal k8s infrastructure, identity and access control, observability, and capacity management.
- Design, guide and vet system designs within the scope of the broader area, and write product or system development code to solve ambiguous problems.
- Review code developed by other engineers and provide feedback to ensure best practices. Oversee end-to-end operations, from coordinating network connectivity to bootstrapping control machines and deploying core Kubernetes infrastructure.
Qualifications
Minimum qualifications:
- Bachelor’s degree, or equivalent practical experience.
- 8 years of experience in software development.
- 3 years of experience with developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage or hardware architecture.
- 3 years of experience in a technical leadership role.
- 2 years of experience in a people management or team leadership role.
Preferred qualifications:
- Master's degree or PhD in Computer Science or a related technical field.
- 3 years of experience working in a complex, matrixed organization.
- Experience building and managing Kubernetes clusters at scale.
- Experience with cloud providers such as Google Cloud Platform (GCP), designing and building infrastructure platforms using compute, networking or storage technologies.
- Experience with deploying and maintaining hardware systems: servers, racks and networks.
- Proven experience working on large-scale infrastructure and integration projects.