Researcher, Agentic Post-training

OpenAI · AI Frontier · San Francisco, CA · Research

Research role focused on post-training agentic models and on developing horizontal improvements in factuality, instruction following, tool use, and multi-agent collaboration. The role involves building and improving training and evaluation infrastructure and creating evals that determine whether models are ready to ship, with direct impact on models used by millions.

What you'd actually do

  1. Own end-to-end research and engineering projects that improve the final post-training of OpenAI’s agentic models.
  2. Decide, together with partner teams, which integrations are ready for inclusion in major model runs.
  3. Develop horizontal model improvements across factuality, instruction following, tool/function calling, multi-agent behavior, reasoning-effort calibration, and other broad capabilities.
  4. Build and improve training, evaluation, grading, and data infrastructure for large-scale RL/post-training runs.
  5. Create evals and diagnostics that help the team understand whether a model is ready to ship.

Skills

Required

  • ML fundamentals
  • LLMs
  • RL
  • RLHF
  • post-training
  • evals
  • model training
  • complex systems
  • pragmatic technical decisions
  • ambiguous problems end-to-end
  • impact over method
  • model behavior taste
  • research
  • infrastructure
  • data
  • product boundaries

Nice to have

  • large-scale model training
  • RL systems
  • graders
  • reward models
  • data pipelines for LLM training
  • coding agents
  • tool-using agents
  • browser/computer-use agents
  • function calling
  • multi-agent systems
  • quant
  • systems
  • infra
  • reliable machinery for high-stakes experimentation
  • product taste
  • writing
  • design
  • code generation
  • agent workflows

What the JD emphasized

  • agentic models
  • post-training
  • tool use
  • multi-agent collaboration
  • evals
  • large-scale RL/post-training runs
  • frontier agentic models

Other signals

  • Post-training agentic models
  • Horizontal model improvements
  • Large-scale RL/post-training runs
  • Frontier agentic models