Research Engineer, Frontier Safety Mitigations, DeepMind

Google · San Francisco, CA

Research Engineer focused on building and deploying advanced safety mitigations for frontier AI models, specifically defending against misuse in domains like Cybersecurity and CBRNE. The role involves creating novel evaluations, red-teaming, implementing robust defenses, and monitoring risks, with a strong emphasis on agentic AI systems and adversarial robustness.

What you'd actually do

  1. Build advanced classifiers and data pipelines to detect misuse, owning the end-to-end process from automated evaluation to rapid model iteration.
  2. Build cross-context monitoring systems to detect coordinated harms, developing novel signal aggregation methods across disparate user sessions to identify large-scale attack vectors.
  3. Implement data-driven, semi-automated account-level response systems to detect, track, and apply strikes against persistent malicious actors using rich signals from production traffic.
  4. Evaluate and secure agentic AI systems by developing threat models, creating testing environments, and deploying robust mitigations against frontier-level agentic hacking and long-horizon attacks.
  5. Advance research in automated red-teaming and adversarial robustness, leveraging multi-turn/agentic attacks to systematically surface misuse vulnerabilities.
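As a rough illustration of the kind of pipeline items 1–3 describe, the sketch below scores individual messages for misuse, aggregates those signals per account across sessions, and issues a strike once a threshold is crossed. Everything here is hypothetical: `misuse_score` stands in for a trained classifier, and the names, thresholds, and strike logic are illustrative, not DeepMind's actual system.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical stand-in for a trained misuse classifier; a production
# system would score model inputs/outputs, not match keywords.
SUSPICIOUS_TERMS = {"exploit", "payload", "synthesis route"}

def misuse_score(text: str) -> float:
    """Return a score in [0, 1] for how suspicious a message looks."""
    hits = sum(term in text.lower() for term in SUSPICIOUS_TERMS)
    return min(1.0, hits / 2)

@dataclass
class AccountMonitor:
    """Aggregates misuse signals across sessions and issues strikes."""
    strike_threshold: float = 1.5
    scores: dict = field(default_factory=lambda: defaultdict(float))
    sessions: dict = field(default_factory=lambda: defaultdict(set))
    strikes: dict = field(default_factory=lambda: defaultdict(int))

    def observe(self, account_id: str, session_id: str, text: str) -> None:
        # Accumulate per-account score across disparate sessions.
        self.sessions[account_id].add(session_id)
        self.scores[account_id] += misuse_score(text)
        if self.scores[account_id] >= self.strike_threshold:
            self.strikes[account_id] += 1
            self.scores[account_id] = 0.0  # reset after issuing a strike

monitor = AccountMonitor()
monitor.observe("acct-1", "s1", "how do I write an exploit payload?")
monitor.observe("acct-1", "s2", "refine the payload for this exploit")
print(monitor.strikes["acct-1"])  # one strike after repeated cross-session signals
```

The point of the sketch is the shape of the problem, not the scoring: signals from separate sessions only become actionable once aggregated at the account level, which is why the role pairs classifier-building with cross-context monitoring.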

Skills

Required

  • software development
  • software design and architecture
  • research-to-deployment pipeline in a frontier AI environment
  • building classifiers
  • anomaly detection systems at scale
  • taking safety defenses or mitigations from research concepts to scalable production systems
  • applied ML projects
  • LLM training
  • inference
  • fine-tuning
  • AI coding agents
  • TPUs
  • JAX
  • AI control
  • chain-of-thought monitoring
  • faithfulness
  • monitorability
  • adversarial machine learning
  • automated red-teaming
  • model interpretability
  • probes

Nice to have

  • PhD in Computer Science or Machine Learning
  • publications at venues such as NeurIPS, ICLR, ICML, or EMNLP
  • cybersecurity detection and response
  • strong architectural judgment

What the JD emphasized

  • critical misuse domains
  • advanced mitigations
  • highly robust
  • tangibly dangerous model capabilities
  • proactively researching and implementing robust, defense-in-depth mitigations
  • frontier models
  • end-to-end defenses
  • Frontier Safety Framework commitments
  • safety mitigations
  • frontier AI environment
  • scalable production systems
  • frontier safety research
  • frontier-level agentic hacking
  • long-horizon attacks

Other signals

  • building novel evaluations
  • red-teaming
  • deploying advanced mitigations
  • monitoring emerging risks
  • building the next generation of safety mitigations
  • highly applied and focuses on building robust, end-to-end defenses
  • advancing research in automated red-teaming and adversarial robustness
  • leveraging multi-turn/agentic attacks