Research Engineer, Frontier Safety Mitigations, DeepMind

Google · San Francisco, CA

Research Engineer focused on building and deploying advanced safety mitigations for frontier AI models, specifically defending against misuse in domains like Cybersecurity and CBRNE. The role involves creating novel evaluations, red-teaming, implementing robust defenses, and monitoring risks, with a strong emphasis on agentic AI systems and adversarial robustness.

What you'd actually do

  1. Build advanced classifiers and data pipelines to detect misuse, owning the end-to-end process from automated evaluation to rapid model iteration.
  2. Build cross-context monitoring systems to detect coordinated harms, developing novel signal aggregation methods across disparate user sessions to identify large-scale attack vectors.
  3. Implement data-driven, semi-automated account-level response systems to detect, track, and apply strikes against persistent malicious actors using rich signals from production traffic.
  4. Evaluate and secure agentic AI systems by developing threat models, creating testing environments, and deploying robust mitigations against frontier-level agentic hacking and long-horizon attacks.
  5. Advance research in automated red-teaming and adversarial robustness, leveraging multi-turn/agentic attacks to systematically surface misuse vulnerabilities.
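As a rough illustration of the kind of pipeline items 1–3 describe, the sketch below scores individual messages for misuse, aggregates those signals per account across sessions, and issues a strike once a threshold is crossed. Everything here is hypothetical: `misuse_score` stands in for a trained classifier, and the names, thresholds, and strike logic are illustrative, not DeepMind's actual system.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical stand-in for a trained misuse classifier; a production
# system would score model inputs/outputs, not match keywords.
SUSPICIOUS_TERMS = {"exploit", "payload", "synthesis route"}

def misuse_score(text: str) -> float:
    """Return a score in [0, 1] for how suspicious a message looks."""
    hits = sum(term in text.lower() for term in SUSPICIOUS_TERMS)
    return min(1.0, hits / 2)

@dataclass
class AccountMonitor:
    """Aggregates misuse signals across sessions and issues strikes."""
    strike_threshold: float = 1.5
    scores: dict = field(default_factory=lambda: defaultdict(float))
    sessions: dict = field(default_factory=lambda: defaultdict(set))
    strikes: dict = field(default_factory=lambda: defaultdict(int))

    def observe(self, account_id: str, session_id: str, text: str) -> None:
        # Accumulate per-account score across disparate sessions.
        self.sessions[account_id].add(session_id)
        self.scores[account_id] += misuse_score(text)
        if self.scores[account_id] >= self.strike_threshold:
            self.strikes[account_id] += 1
            self.scores[account_id] = 0.0  # reset after issuing a strike

monitor = AccountMonitor()
monitor.observe("acct-1", "s1", "how do I write an exploit payload?")
monitor.observe("acct-1", "s2", "refine the payload for this exploit")
print(monitor.strikes["acct-1"])  # one strike after repeated cross-session signals
```

The point of the sketch is the shape of the problem, not the scoring: signals from separate sessions only become actionable once aggregated at the account level, which is why the role pairs classifier-building with cross-context monitoring.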

Skills

Required

  • software development
  • software design and architecture
  • research-to-deployment pipeline in a frontier AI environment
  • building classifiers
  • anomaly detection systems at scale
  • taking safety defenses or mitigations from research concepts to scalable production systems
  • applied ML projects
  • LLM training
  • inference
  • fine-tuning
  • AI coding agents
  • TPUs
  • JAX
  • AI control
  • chain-of-thought monitoring
  • faithfulness
  • monitorability
  • adversarial machine learning
  • automated red-teaming
  • model interpretability
  • probes

Nice to have

  • PhD in Computer Science or Machine Learning
  • publications at venues such as NeurIPS, ICLR, ICML, or EMNLP
  • cybersecurity detection and response
  • strong architectural judgment

What the JD emphasized

  • critical misuse domains
  • advanced mitigations
  • highly robust
  • tangibly dangerous model capabilities
  • proactively researching and implementing robust, defense-in-depth mitigations
  • frontier models
  • end-to-end defenses
  • Frontier Safety Framework commitments
  • safety mitigations
  • frontier AI environment
  • scalable production systems
  • frontier safety research
  • frontier-level agentic hacking
  • long-horizon attacks

Other signals

  • building novel evaluations
  • red-teaming
  • deploying advanced mitigations
  • monitoring emerging risks
  • building the next generation of safety mitigations
  • highly applied and focuses on building robust, end-to-end defenses
  • advancing research in automated red-teaming and adversarial robustness
  • leveraging multi-turn/agentic attacks