Researcher, Robustness & Safety Training

OpenAI · AI Frontier · San Francisco, CA · Safety Systems

Researcher focused on AI safety, specifically training models for robustness, alignment, and adversarial resistance using techniques such as RLHF. The role involves setting research directions and implementing safety improvements in OpenAI's products.

What you'd actually do

  1. Conduct state-of-the-art research on AI safety topics such as RLHF, adversarial training, and robustness.
  2. Implement new methods in OpenAI’s core model training and launch safety improvements in OpenAI’s products.
  3. Set the research directions and strategies to make our AI systems safer, more aligned and more robust.
  4. Collaborate with cross-functional teams, including Trust & Safety (T&S), legal, policy, and other research teams, to ensure that our products meet the highest safety standards.
  5. Actively evaluate and understand the safety of our models and systems, identifying areas of risk and proposing mitigation strategies.

Skills

Required

  • AI safety research
  • RLHF
  • adversarial training
  • robustness
  • alignment
  • deep learning research
  • collaboration

Nice to have

  • engineering skills

What the JD emphasized

  • 4+ years of experience in the field of AI safety
  • safety work for AI model deployment

Other signals

  • AI safety research
  • RLHF
  • adversarial training
  • robustness
  • alignment