Research Scientist, Agent Post Training, DeepMind

Google · Big Tech · Mountain View, CA +1

Research Scientist at Google DeepMind focused on agent post-training for the Gemini model. The role involves driving research, designing experiments, developing infrastructure for data analysis and model evaluation, and collaborating with other researchers. Requires a PhD in ML/CS or a related quantitative field (or equivalent practical experience), 2 years of experience with ML frameworks such as JAX, Flax, or PyTorch, and research experience in RL, tool use, or agentic systems. Preferred qualifications include publications in RL, RLHF, or tool-use algorithms, experience with large-scale distributed training, and research experience in systems design.

What you'd actually do

  1. Drive the research process for large-scale agent post-training from hypothesis formulation to delivery in the Gemini model recipe.
  2. Design and execute ablation studies to validate research hypotheses and accelerate experimental feedback loops.
  3. Communicate research findings, progress, and outcomes to the broader team through visualizations and reports.
  4. Develop research infrastructure and utilities for data analysis and model evaluations using standard engineering practices.
  5. Collaborate with other research scientists and engineers to maintain a regular feedback and communication loop.

Skills

Required

  • PhD in Computer Science, Machine Learning, or a related quantitative field, or equivalent practical experience.
  • 2 years of experience with machine learning frameworks such as JAX, Flax, or PyTorch.
  • Experience conducting research in reinforcement learning, tool use, or agentic systems.

Nice to have

  • Experience publishing research in reinforcement learning, reinforcement learning from human feedback, or tool-use algorithms at machine learning venues.
  • Experience working with large-scale distributed training infrastructure and scaling experiments.
  • Research experience in systems design, code complexity, or working with large-codebase environments.
  • Experience developing simple, scalable solutions for complex, open-ended research problems.

What the JD emphasized

  • agent post-training
  • reinforcement learning
  • tool use
  • agentic systems
  • reinforcement learning from human feedback
  • tool-use algorithms

Other signals

  • agent post-training
  • Gemini model recipe
  • reinforcement learning
  • tool use
  • agentic systems