Senior Staff Software Engineer, Agentic Data Tooling, DeepMind

Google · Big Tech · New York, NY +3

Senior Staff Software Engineer focused on building agentic data tooling for Gemini, including evaluation frameworks (SmithBench, RE-Bench), data collection pipelines for agent interactions, and human-in-the-loop annotation systems to accelerate AI capabilities and agent development.

What you'd actually do

  1. Design and create novel data tooling to accelerate Gemini model evaluation, training, and hill climbing to improve agentic capabilities.
  2. Facilitate ingestion and creation of corpora representing complex worlds, and record human, agentic, and hybrid trajectories through Reinforcement Learning (RL) environments.
  3. Build scalable data collection pipelines that capture multi-turn, tool-using agent interactions and enable rapid iteration on environment complexity and reward design.
  4. Create human-in-the-loop annotation and trajectory review tooling, analytics dashboards, and agentic orchestration frameworks to continuously generate, curate, and validate high-signal training corpora at scale.

Skills

Required

  • software development in one or more programming languages (e.g., Python, C, C++, Java, JavaScript)
  • systems design
  • product management
  • software engineering roles

Nice to have

  • Computer Science
  • Data Science
  • Reinforcement Learning (RL) environments
  • multi-turn, tool-using agent interactions
  • human-in-the-loop annotation
  • analytics dashboards

What the JD emphasized

  • building the core infrastructure and tooling that powers Gemini's agentic capabilities
  • next-generation evaluation frameworks
  • agent testing using 2D and 3D games
  • developing test problems within physics simulators
  • human-in-the-loop annotation and trajectory review tooling
  • agentic orchestration frameworks

Other signals

  • designing environments to record programmatic agent-API interactions
  • building computer-control systems to capture real-time human and model trajectories
  • tooling will deliver the foundational training and evaluation data that accelerates AI capabilities