Research Engineer, Frontier Safety Loss of Control, DeepMind

Google · Big Tech · San Francisco, CA +1

Research Engineer focused on developing monitoring and control systems for potentially misaligned AI agents to mitigate risks of extreme harms. This involves designing, building, and testing monitors, implementing response policies, and conducting adversarial testing. The role emphasizes preparing for the internal use of advanced AI systems and ensuring safety and ethics.

What you'd actually do

  1. Identify potential harms from misaligned agents and develop strategies for detection and prevention.
  2. Implement technical controls that monitor agent thoughts and behaviour and respond to mitigate potential harms.
  3. Integrate various agent behaviour signals from across the organisation to inform response policies.
  4. Conduct adversarial testing of controls.
  5. Work with internal product teams to ensure that control systems are adopted across all high-risk AI surfaces.

Skills

Required

  • software development in Python
  • engineering and agentic assistance
  • frontier AI research and development environment
  • professional software engineering or research team environment
  • technical stakeholders
  • frontier model risk

Nice to have

  • engineering or product design for AI tools or assistants, especially those focused on ML Research and Development (R&D)
  • cybersecurity detection and response
  • collaborating or leading an applied ML project
  • Large Language Model (LLM) training and inference
  • AI control, chain-of-thought and other monitoring, faithfulness and monitorability, and related research areas

What the JD emphasized

  • frontier AI research and development environment
  • frontier model risk

Other signals

  • developing and implementing response policies to preserve AI usefulness while mitigating risks
  • foreseeing ways in which our control tools might be bypassed or degraded
  • building defense-in-depth against AI that might persistently pursue goals that users and system developers did not intend
  • Identify potential harms from misaligned agents and develop strategies for detection and prevention
  • Implement technical controls that monitor agent thoughts and behaviour and respond to mitigate potential harms
  • Conduct adversarial testing of controls
  • Work with internal product teams to ensure that control systems are adopted across all high-risk AI surfaces
  • Experience working in a frontier AI research and development environment
  • Experience in frontier model risk