Researcher, Alignment

OpenAI · AI Frontier · San Francisco, CA · Research

Research role focused on ensuring AI systems are safe, trustworthy, and aligned with human values, developing methodologies for robust intent following, and integrating human oversight.

What you'd actually do

  1. Develop and evaluate alignment capabilities that are subjective, context-dependent, and hard to measure.
  2. Design evaluations to reliably measure risks and alignment with human intent and values.
  3. Build tools and evaluations to study and test model robustness in different situations.
  4. Design experiments to understand the scaling laws for alignment as a function of compute, data, context and action lengths, and adversary resources.
  5. Design and evaluate new human-AI interaction paradigms and scalable oversight methods that redefine how humans interact with, understand, and supervise our models.

Skills

Required

  • PhD or equivalent research experience in computer science, computational science, data science, cognitive science, or similar fields.
  • Strong engineering skills, particularly in designing and optimizing large-scale machine learning systems (e.g., PyTorch).
  • Deep understanding of the science behind alignment algorithms and techniques.
  • Ability to develop data visualization or data collection interfaces (e.g., TypeScript, Python).

Nice to have

  • Team player
  • Enjoyment of fast-paced, collaborative, cutting-edge research environments.
  • Focus on developing AI models that are trustworthy, safe, and reliable, especially in high-stakes scenarios.

What the JD emphasized

  • alignment
  • human intent
  • human oversight
  • evaluations
  • risks

Other signals

  • AI safety
  • AI alignment
  • human values
  • AI supervision
  • AI control