Research Engineer, Frontier Safety Mitigations, DeepMind

Google · Big Tech · London, United Kingdom

Research Engineer focused on frontier AI safety mitigations, defending against misuse domains like CBRNE and Harmful Manipulation. Responsibilities include building evaluations, red-teaming, deploying in-model and out-of-model mitigations, and monitoring risks for frontier models, particularly agentic AI systems. The role involves developing classifiers, monitoring systems, and advancing research in automated red-teaming and adversarial robustness.

What you'd actually do

  1. Build advanced classifiers and data pipelines to detect misuse, owning the end-to-end process from automated evaluation to rapid model iteration.
  2. Build cross-context monitoring systems to detect coordinated harms, developing novel signal aggregation methods across disparate user sessions to identify large-scale attack vectors.
  3. Implement data-driven, semi-automated account-level response systems to detect, track, and apply strikes against persistent malicious actors using rich signals from production traffic.
  4. Evaluate and secure agentic AI systems by developing threat models, creating testing environments, and deploying robust mitigations against frontier-level agentic hacking and long-horizon attacks.
  5. Advance research in automated red-teaming and adversarial robustness, leveraging multi-turn/agentic attacks to systematically test for and uncover misuse vulnerabilities.

Skills

Required

  • Bachelor’s degree or equivalent practical experience.
  • 5 years of experience with software development in one or more programming languages.
  • 3 years of experience testing, maintaining, or launching software products, and 1 year of experience with software design and architecture.

Nice to have

  • PhD in Computer Science, Machine Learning, or equivalent practical experience, or publications at venues such as NeurIPS, ICLR, ICML, or EMNLP.
  • Experience with cybersecurity detection and response, building classifiers and anomaly detection systems at scale, taking safety defenses or mitigations from research concepts to scalable production systems.
  • Experience in adversarial machine learning, automated red-teaming, or model interpretability and probes.
  • Experience collaborating on or leading applied ML projects, including LLM training, inference, and fine-tuning.
  • Experience using AI coding agents with strong architectural judgment, and experience with TPUs and JAX.
  • Knowledge of AI control, chain-of-thought monitoring, monitorability, and related frontier safety research.

What the JD emphasized

  • defending against misuse domains
  • critical part of the overall strategy for building safe AI
  • build safety mitigations for frontier models
  • building defenses against risks
  • automated evaluation
  • rapid model iteration
  • novel signal aggregation methods
  • large-scale attack vectors
  • data-driven, semi-automated account-level response systems
  • persistent malicious actors
  • rich signals from production traffic
  • Evaluate and secure agentic AI systems
  • threat models
  • testing environments
  • robust mitigations
  • frontier-level agentic hacking
  • long-horizon attacks
  • automated red-teaming
  • adversarial robustness
  • multi-turn/agentic attacks
  • misuse vulnerabilities
  • cybersecurity detection and response
  • anomaly detection systems at scale
  • scalable production systems
  • adversarial machine learning
  • model interpretability and probes
  • LLM training, inference, and fine-tuning
  • AI coding agents
  • TPUs and JAX
  • AI control
  • chain-of-thought monitoring
  • monitorability
  • frontier safety research

Other signals

  • defending against misuse domains
  • build evaluations
  • red-teaming
  • deploy mitigations
  • monitor emerging risks
  • build safety mitigations for frontier models
  • build advanced classifiers and data pipelines to detect misuse
  • build cross-context monitoring systems to detect coordinated harms
  • Implement data-driven, semi-automated account-level response systems
  • Evaluate and secure agentic AI systems
  • developing threat models
  • creating testing environments
  • deploying robust mitigations against frontier-level agentic hacking and long-horizon attacks
  • advance research in automated red-teaming and adversarial robustness
  • leveraging multi-turn/agentic attacks to systematically test for and uncover misuse vulnerabilities
  • Experience with cybersecurity detection and response
  • building classifiers and anomaly detection systems at scale
  • taking safety defenses or mitigations from research concepts to scalable production systems
  • Experience in adversarial machine learning, automated red-teaming, or model interpretability and probes
  • Knowledge of AI control, chain-of-thought monitoring, monitorability, and related frontier safety research