What you'd actually do

Design and create novel data tooling to accelerate Gemini model evaluation, training, and hill climbing to improve agentic capabilities.

Facilitate ingestion and creation of corpora representing complex worlds, and record human, agentic, and hybrid trajectories through Reinforcement Learning (RL) environments.

Build scalable data collection pipelines that capture multi-turn, tool-using agent interactions and enable rapid iteration on environment complexity and reward design.

Create human-in-the-loop annotation and trajectory review tooling, analytics dashboards, and agentic orchestration frameworks to continuously generate, curate, and validate high-signal training corpora at scale.

Skills

Required

software development in one or more programming languages (e.g., Python, C, C++, Java, JavaScript)
systems design
product management
software engineering roles

Nice to have

Computer Science
Data Science
Reinforcement Learning (RL) environments
multi-turn, tool-using agent interactions
human-in-the-loop annotation
analytics dashboards

Other signals

building core infrastructure and tooling that powers Gemini's agentic capabilities

designing environments to record programmatic agent-API interactions

building computer-control systems to capture real-time human and model trajectories

delivering the foundational training and evaluation data that accelerates AI capabilities

At Google DeepMind, our mission is to build the world's first general-purpose learning agent. Central to this mission is the complex task of measuring the intelligence of our prototypes. As a Software Engineer, you will work with the cutting-edge AI agents developed by our exceptional team of Machine Learning and Neuroscience research scientists. Your responsibilities will range from creating systems for agent testing using 2D and 3D games to developing test problems within physics simulators. You will create graphical visualizations of results, build competitive agent leaderboards, and test new algorithms on robots. To succeed in this role, you will need a strong foundation in software engineering and enjoy working on a wide range of challenging problems within a mission-driven team.

Shape the future of AI by building the core infrastructure and tooling that powers Gemini's agentic capabilities. Through advanced data curation and creation, you will drive the development of next-generation evaluation frameworks:

SmithBench: Our gold-standard benchmark testing whether AI agents can autonomously execute complex, end-to-end, first-party Google engineering workflows (such as CL lifecycles, bug investigations, and pipeline orchestration).

RE-Bench: Our benchmark designed to measure agent performance on highly complex, long-horizon research engineering tasks.

Whether you are designing environments to record programmatic agent-API interactions or building computer-control systems to capture real-time human and model trajectories, your tooling will deliver the foundational training and evaluation data that accelerates AI capabilities across models, harnesses, and skills.

Artificial intelligence will be one of humanity’s most transformative inventions. At Google DeepMind, we are a pioneering AI lab with exceptional interdisciplinary teams focused on advancing AI development to solve complex global challenges and accelerate high-quality product innovation for billions of users. We use our technologies for widespread public benefit and scientific discovery, ensuring safety and ethics are always our highest priority.

We are pushing the boundaries across multiple domains. Our global teams offer diverse learning opportunities and varied career pathways for those driven to achieve exceptional results through collective effort. Individual pay is determined by factors including job-related skills, experience, and relevant education or training.

US: $262,000 - $365,000 (USD) + 25% bonus target + bonus + equity + benefits

Learn more about benefits at Google.

Responsibilities

Design and create novel data tooling to accelerate Gemini model evaluation, training, and hill climbing to improve agentic capabilities.
Facilitate ingestion and creation of corpora representing complex worlds, and record human, agentic, and hybrid trajectories through Reinforcement Learning (RL) environments.
Build scalable data collection pipelines that capture multi-turn, tool-using agent interactions and enable rapid iteration on environment complexity and reward design.
Create human-in-the-loop annotation and trajectory review tooling, analytics dashboards, and agentic orchestration frameworks to continuously generate, curate, and validate high-signal training corpora at scale.

Qualifications

Minimum qualifications:

Bachelor's degree in Computer Science, IT, a related field, or equivalent practical experience.
8 years of experience with software development in one or more programming languages (e.g., Python, C, C++, Java, JavaScript).

Preferred qualifications:

Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
5 years of experience in systems design, product management, or software engineering roles.

Senior Staff Software Engineer, Agentic Data Tooling, DeepMind
