Vice President, Software Engineering, Data Center Infrastructure

Google · Big Tech · Sunnyvale, CA +1

This role is for a Vice President of Software Engineering focused on Data Center Infrastructure at Google. The primary responsibility is to lead global software engineering teams in developing, testing, and deploying products and hardware that support Google's Platform Infrastructure. While the role operates within an organization that supports AI services like Vertex AI and Gemini models, and mentions ML Training and Inference, the core focus is on the underlying infrastructure and not on the direct development or research of AI models themselves. The role involves strategic planning, leading engineering organizations, driving software quality, managing scalable systems, and ensuring fleet health, with a strong emphasis on infrastructure for cloud services.

What you'd actually do

  1. Develop a strategic outlook for the Google Data Center Infrastructure organization to constantly innovate and implement novel solutions for next-generation cloud software and services.
  2. Lead a large Engineering organization in all areas of compute infrastructure, deployment and testing, including enhancements to TI/Cloud/Storage/GPU/TPU based servers, ML Training, ML Inference, and YouTube transcoding.
  3. Drive the software quality metrics, focusing on continuous improvement.
  4. Manage robust and scalable systems in order to get ahead of the exponential market demand.
  5. Manage the health of the production fleet and partner with internal supply chain business teams and external supply chain partners (e.g., suppliers, CMs, and 3PLs) in architecting and leading the planning process for PIE.

Skills

Required

  • software development
  • design
  • architecture
  • data structures
  • algorithms
  • testing
  • QA engineering delivery
  • building and developing large-scale infrastructure
  • distributed systems
  • networks
  • compute technologies
  • storage
  • hardware architecture

Nice to have

  • PhD degree
  • building software and large-scale distributed systems
  • delivering within time-sensitive timelines
  • understanding of private and public cloud design considerations and limitations in the areas of virtualization, global infrastructure, distributed systems, load balancing, networking, large-scale data storage, and security
  • working with globally distributed, cross-functional teams, partnering with groups (e.g., Supply Chain Operations, Sales, Engineering, Product Management, Product Marketing, UX and UI), brokering trade-offs with stakeholders, and understanding their needs

What the JD emphasized

  • next-generation cloud software and services
  • ML Training
  • ML Inference
  • robust and scalable systems
  • exponential market demand