Senior Software Engineer, TPU Performance, Hardware, Software Co-design

Google · Big Tech · Sunnyvale, CA +2

This role focuses on hardware (HW) and software (SW) co-design and optimization of ML systems, specifically for TPUs. The engineer will analyze and improve the performance, power, and energy efficiency of ML workloads, including large-language models and large embedding models, by optimizing model architecture, software systems, and hardware architecture. The role also involves exploring and defining future ML accelerator architectures.

What you'd actually do

  1. Analyze performance, power, and energy efficiency of current and future ML workloads to identify issues.
  2. Bring current and future ML systems to peak efficiency through full-stack hardware-software co-design, proposing HW-aware algorithm optimizations and related simulation modeling.
  3. Establish a deep understanding of the latest business-critical production ML models (e.g., large-language models, large embedding models) to inform optimizations of model architecture, software systems, and hardware architecture.
  4. Explore and define future ML accelerator system and chip architectures with objective and data-driven insights.

Skills

Required

  • software development in one or more programming languages
  • coding experience in one or more of the following languages: C, C++, Java, or Python
  • testing, maintaining, or launching software products

Nice to have

  • data structures and algorithms
  • ML algorithm and performance analysis and optimization
  • architecture simulator development and microarchitecture
  • computer architecture such as TPUs or other accelerators
  • LLMs and ML frameworks and compilers
  • communication skills

What the JD emphasized

  • ML systems with hardware (HW) and software (SW) co-design and optimization
  • ML workloads
  • large-language models
  • large embedding models
  • ML accelerator system and chip architectures

Other signals

  • TPUs