Lead Software Engineer, Infrastructure Quality, Robotics, DeepMind

Google · Big Tech · Mountain View, CA +1

Lead Software Engineer focused on quality and engineering productivity for robotics AI, responsible for scaling software releases, AI model validation, testing environments, and system reliability to accelerate AGI development in the physical world.

What you'd actually do

  1. Design, implement, and own the long-term roadmap and execution for large-scale software and hardware-in-the-loop (HWITL) testing environments, covering conventional and on-robot testing, both manual and automated.
  2. Design, deploy, and manage a robust software release discipline, with flexible, powerful staging environments and comprehensive regression checks for software, AI models, and robot hardware over time.
  3. Define productivity metrics (such as code in production and bottlenecks), reduce code debt, improve code health, and optimize the data/ML Flywheel, from ingestion and labeling to model evaluation and deployment.
  4. Lead the application of agentic AI, including automating root-cause analysis of HWITL failures, edge-case scenarios, and data quality or AI model performance problems.
  5. Drive end-to-end system reliability for software systems and robot fleets. Partner with external hardware vendors and internal teams to co-develop, integrate, and test joint software and hardware releases.

Skills

Required

  • programming in a general purpose coding language (e.g., C, C++, Java, JavaScript, or Python)
  • experience in a people management or supervision/team leadership role
  • software quality and release strategy on custom or experimental hardware
  • design, implement, and own the long-term roadmap and execution for large-scale software and HWITL testing environments
  • design, deploy, and manage a robust software release discipline
  • define productivity metrics
  • optimize the data/ML Flywheel
  • application of agentic AI
  • drive end-to-end system reliability for software systems and robot fleets

Nice to have

  • Master's degree or PhD in Computer Science, Artificial Intelligence, Machine Learning, or related technical fields
  • systems development engineering
  • system reliability engineering
  • software engineering with a focus on operations
  • advanced systems administration

What the JD emphasized

  • software quality and AI model validation
  • Artificial General Intelligence (AGI)
  • testing, reliability, validation, release strategy, and overall engineering productivity and velocity
  • non-deterministic hardware wear-and-tear, complex large-scale model deployment, fast inference, and the massive data requirements of AGI
  • software quality and release strategy on custom or experimental hardware
  • high quality bar
  • data/ML Flywheel, from ingestion and labeling to model evaluation and deployment
  • agentic AI
  • HWITL failures, edge-case scenarios, or data quality or AI model performance problems
  • end-to-end system reliability for software systems and robot fleets

Other signals

  • scaling software quality and AI model validation
  • influencing the velocity at which our team solves Artificial General Intelligence (AGI) in the physical world
  • building and upscaling our testing, reliability, validation, release strategy, and overall engineering productivity and velocity
  • solving for non-deterministic hardware wear-and-tear, complex large-scale model deployment, fast inference, and the massive data requirements of AGI
  • experienced with software quality and release strategy on custom or experimental hardware
  • work will directly determine how many iterations our engineers and scientists can run per day, week, and year, while holding a high quality bar
  • optimizing the data/ML Flywheel, from ingestion and labeling to model evaluation and deployment
  • automating root-cause analysis of HWITL failures, edge-case scenarios, or data quality or AI model performance problems
  • drive end-to-end system reliability for software systems and robot fleets