Researcher, Interpretability

OpenAI · AI Frontier · San Francisco, CA · Safety Systems

Researcher studying the internal representations of deep learning models to understand model behavior and engineer more understandable representations, with the goal of ensuring the safety of powerful AI systems. The role involves developing and publishing research, engineering infrastructure for studying model internals, and collaborating across teams.

What you'd actually do

  1. Develop and publish research on techniques for understanding representations of deep networks.
  2. Engineer infrastructure for studying model internals at scale.
  3. Collaborate across teams to work on projects that OpenAI is uniquely suited to pursue.
  4. Guide research directions toward demonstrable usefulness and/or long-term scalability.

Skills

Required

  • Python
  • deep learning
  • mechanistic interpretability
  • AI safety research

Nice to have

  • Ph.D.
  • quantitative reasoning
  • engineering

What the JD emphasized

  • strong background in engineering
  • quantitative reasoning
  • research process
  • mechanistic interpretability
  • AI safety
  • safe AGI
  • large-scale AI systems
  • research engineering experience

Other signals

  • interpretability
  • AI safety
  • deep learning models