Senior Software Engineer, TPU Supercomputer, Infrastructure, Cloud

Google · Big Tech · Sunnyvale, CA +1

This role is for a Senior Software Engineer focused on the infrastructure for TPU supercomputers within Google Cloud. The responsibilities include designing and maintaining software across various layers of the stack, implementing network routing rules, developing control software for distributed hardware, building distributed solutions for monitoring and controlling TPU accelerators, and overseeing the full lifecycle of supercomputing systems. The role requires significant experience in software development, infrastructure, distributed systems, and hardware architecture.

What you'd actually do

  1. Design and maintain TPU supercomputer software across various layers of the stack, including host-side daemons and hardware-level interfaces.
  2. Implement network routing rules directly within TPU hardware to facilitate efficient data movement across supercomputing systems.
  3. Develop control software for specialized machines that manage and orchestrate distributed collections of networked hardware.
  4. Build distributed software solutions on Google’s internal and cloud infrastructure to monitor, deploy, and control TPU accelerators at scale.
  5. Oversee the full lifecycle of supercomputing systems, including the qualification and servicing of hardware and software components.

Skills

Required

  • software development
  • large-scale infrastructure
  • distributed systems
  • networks
  • compute technologies
  • storage
  • hardware architecture
  • software design
  • software architecture
  • testing
  • maintaining software products
  • launching software products

Nice to have

  • C++
  • Go
  • SQL
  • data structures
  • algorithms
  • technical leadership

What the JD emphasized

  • 5 years of experience with software development in one or more programming languages
  • 3 years of experience testing, maintaining, or launching software products, and 1 year of experience with software design and architecture
  • 3 years of experience with developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage or hardware architecture