Researcher, Pretraining Safety

OpenAI · AI Frontier · San Francisco, CA · Safety Systems

Researcher focused on pretraining safety for AI models, developing upstream safety evaluations, creating safer priors through targeted pretraining, and designing safe-by-design architectures. The role involves identifying safety-relevant behaviors in early-stage models and reducing risk during training.

What you'd actually do

  1. Develop new techniques to predict, measure, and evaluate unsafe behavior in early-stage models
  2. Design data curation strategies that improve pretraining priors and reduce downstream risk
  3. Explore safe-by-design architectures and training configurations that improve controllability
  4. Introduce novel safety-oriented loss functions, metrics, and evals into the pretraining stack
  5. Work closely with cross-functional safety teams to unify pre- and post-training risk reduction

Skills

Required

  • developing or scaling pretraining architectures (LLMs, diffusion models, multimodal models, etc.)
  • working with training infrastructure, data pipelines, and evaluation frameworks (e.g., Python, PyTorch/JAX, Apache Beam)
  • designing, implementing, and iterating on experiments
  • taking a data-driven approach, with strong statistical reasoning and rigor in experimental design
  • building clean, scalable research workflows and streamlining processes

Nice to have

  • collaborating with diverse technical and cross-functional partners (e.g., policy, legal, training)

What the JD emphasized

  • pretraining
  • safety
  • evaluations
  • evaluate
  • risk
  • training

Other signals

  • Develop upstream safety evaluations
  • Create safer priors through targeted pretraining and mid-training interventions
  • Design safe-by-design architectures
  • Identify safety-relevant behaviors as they first emerge in base models
  • Evaluate and reduce risk without waiting for full-scale training runs