Machine Learning Hardware Architect, Hardware, Software Co-design, Google Cloud

Google · Big Tech · Tel Aviv, Israel +1

This role focuses on architecting and defining the roadmap for AI/ML hardware acceleration, specifically TPUs, for Google Cloud. It involves co-design between model architecture and next-generation hardware, optimizing for ML serving and training capabilities, and integrating large-scale foundation models with advanced silicon architectures. The role requires defining technical roadmaps, architecting simulation frameworks, guiding system-level performance analysis, and managing cross-functional partnerships across hardware, compiler, and ML teams.

What you'd actually do

  1. Define and drive the technical roadmap and architecture for the hardware/software stack to ensure exceptional performance for ML models. Act as the technical liaison across research, software, and hardware teams, steering model architecture innovation to maximize scaling, quality, and hardware efficiency.
  2. Architect next-generation configurable simulation frameworks and performance models, setting the organizational standard for evaluating complex microarchitectural decisions. Drive high-stakes choices regarding Power, Performance, Area (PPA) and buildability for future chip and system architectures, expertly balancing long-term technological trends with strict product delivery timelines.
  3. Guide system-level performance analysis across highly distributed ML systems, innovating new methodologies to optimize and balance compute, memory bandwidth, and inter-chip network requirements. This work directly shapes the future of high-performance AI infrastructure and hardware-software co-design.
  4. Manage cross-functional partnerships across hardware, compiler development, and ML teams.

Skills

Required

  • Computer architecture
  • Chip architecture
  • Hardware-software co-design
  • C++
  • Python
  • Performance modeling
  • Simulation
  • System analysis

Nice to have

  • Master’s degree or PhD in Electrical Engineering, Computer Engineering, or Computer Science with an emphasis on computer architecture
  • Experience as a lead architect managing multi-generational hardware solutions or performance optimizations for massive-scale ML training and inference
  • Semiconductor technologies, industry trends, and the future trajectory of process, memory, interconnects, and packaging
  • Deep learning frameworks (e.g., TensorFlow, PyTorch)
  • Underlying execution models of deep learning frameworks

What the JD emphasized

  • exceptional performance for ML models
  • maximize scaling, quality, and hardware efficiency
  • high-stakes choices
  • strict product delivery timelines
  • optimize and balance compute, memory bandwidth, and inter-chip network requirements
  • high-performance AI infrastructure
  • hardware-software co-design

Other signals

  • TPU development
  • AI/ML hardware acceleration
  • ML serving and training capabilities
  • foundation models
  • silicon architectures
  • high-performance accelerators