Staff Software Engineer, AI Infrastructure, Google Cloud, Applied AI

Google · Big Tech · Sunnyvale, CA +1

Staff Software Engineer focused on building and scaling high-performance, distributed infrastructure for agentic AI workflows in Google Cloud. The role involves transitioning experimental models into robust production services, ensuring low latency and reliability for multi-agent systems, and iterating on customer needs for enterprise use cases.

What you'd actually do

  1. Architect and build high-performance, distributed infrastructure to support agentic AI workflows. Keep agentic systems low-latency under real-world enterprise loads.
  2. Take full ownership of the tech stack, transitioning experimental models into robust production services. Ensure system reliability, observability, and fault tolerance in complex, multi-agent environments.
  3. Iterate rapidly on customer needs by identifying product gaps and implementing missing features. Translate vague user requirements into concrete, engineering-driven solutions that directly improve the user experience.
  4. Provide technical guidance on system architecture and code quality.
  5. Foster a culture of engineering excellence through design reviews, code audits, and the adoption of best practices.

Skills

Required

  • Software Development
  • C++
  • Artificial Intelligence
  • Distributed Systems
  • LLMs
  • High Performance Computing

Nice to have

  • Master’s degree or PhD in Computer Science, or a related field with a focus on Systems or AI
  • Experience building and maintaining multi-agent systems or complex applications in an enterprise setting
  • Experience with evaluation frameworks for AI quality in production (e.g., A/B testing, shadow deployment, online learning)
  • Ability to navigate ambiguity and iterate quickly on customer feedback to deliver solutions that precisely meet market needs

What the JD emphasized

  • agentic AI solutions
  • multi-agent systems
  • enterprise scale
  • customer feedback

Other signals

  • building agentic AI solutions
  • architecting scalable infrastructure for multi-agent systems
  • customer feedback
  • enterprise scale