Staff Software Engineer, Embedded Systems/Firmware

Google · Big Tech · Sunnyvale, CA +1

This Staff Software Engineer role focuses on embedded systems and firmware for Google's custom accelerators (TPUs, VCUs). It involves leading the design, development, and deployment of critical firmware features, bridging hardware and software teams, and providing technical direction. The work supports ML infrastructure and hyperscale computing.

What you'd actually do

  1. Lead design, development, and deployment of critical firmware features (security, power management, telemetry) across multiple TPU generations, from bare-metal microcontrollers to distributed systems.
  2. Bridge hardware design and software teams to co-engineer groundbreaking chip features and establish high-performance interfaces that unlock advanced ML infrastructure capabilities.
  3. Provide technical direction on high-impact projects while mentoring and influencing a distributed engineering team to foster a culture of technical excellence.
  4. Identify and mitigate complex system-level challenges through rigorous architectural design and testing to ensure the reliability of public TPU platforms.
  5. Own project priorities, deadlines, and deliverables while facilitating cross-team clarity to deliver an exceptional developer experience for platforms like Trillium and Ironwood.

Skills

Required

  • software development
  • testing
  • launching software products
  • embedded operating systems
  • software design and architecture
  • C/C++

Nice to have

  • data structures and algorithms
  • technical leadership, including leading project teams and setting technical direction
  • working in a complex, matrixed organization on cross-functional or cross-business projects
  • advanced computer architecture
  • industry-standard RTOSes (e.g., Zephyr, FreeRTOS)
  • robust API design
  • building complex systems within distributed engineering teams

What the JD emphasized

  • critical firmware features
  • groundbreaking chip features
  • high-performance interfaces
  • advanced ML infrastructure capabilities
  • complex system-level challenges
  • rigorous architectural design and testing
  • reliability of public TPU platforms