Senior Software Engineer, GPU Performance

Google Google · Big Tech · Sunnyvale, CA +3

Senior Software Engineer focused on optimizing GPU performance for Google's AI models and products. This role involves low-level GPU programming, compiler optimizations, and influencing the GPU software stack and fleet deployment to ensure top performance for ML workloads.

What you'd actually do

  1. Build optimizations for the latest generation of GPUs that power Google’s most critical products and services, impacting billions of users worldwide.
  2. Address the most challenging performance bottlenecks through Google’s unparalleled access to the latest generation of GPUs, tooling, and a decade of experience building AI accelerators.
  3. Drive optimizations across Google’s GPU software stack from ML compiler cost model design to optimizing high performance GPU kernels to cross node model serving configurations.
  4. Influence the technical direction of the GPU software ecosystem at Google by collaborating with ML, compiler design, and systems architecture teams.
  5. Influence the deployment of Google’s GPU fleet by working with various product teams across Google.

Skills

Required

  • software development
  • software design and architecture
  • modern GPU architectures
  • memory hierarchies
  • performance bottlenecks
  • low-level GPU programming (CUDA, Triton, CUTLASS, etc.)
  • performance engineering techniques
  • modern LLMs and their deployment on AI accelerators

Nice to have

  • data structures and algorithms
  • technical leadership
  • compiler optimization
  • code generation
  • runtime systems for GPU architectures (OpenXLA, MLIR, Triton, etc.)

What the JD emphasized

  • latest generation of GPUs
  • performance bottlenecks
  • GPU software stack
  • high performance GPU kernels
  • model serving configurations
  • GPU software ecosystem
  • GPU fleet

Other signals

  • GPU performance optimization
  • ML infrastructure
  • LLM deployment