Lead Research Scientist, AI Safety and Reinforcement Learning

AMD · Semiconductors · Santa Clara, CA · Engineering

Lead Research Scientist focused on recursive self-improvement (RSI) in AI systems: researching how models, data generators, or toolchains can improve their own training signals, curricula, or verification under explicit governance and human oversight. The role involves evaluating when RSI is beneficial and when it instead amplifies bias or reward hacking, and designing the measurement and containment needed to keep RSI pipelines safe and auditable for AMD's AI programs.

What you'd actually do

  1. Research self-improving training loops: model-generated supervision, iterative distillation, self-critique, and automated curriculum updates with clear scope limits
  2. Develop theory- and systems-grounded evaluations for capability drift, Goodhart effects, and distributional shift in closed-loop training
  3. Partner with RL scientists on where RSI-style objectives intersect policy optimization and preference learning
  4. Define red-team protocols and monitoring for RSI pilots; document rollback criteria before experiments touch shared infrastructure
  5. Publish or produce technical reports where appropriate; align internal narrative with responsible deployment standards

Skills

Required

  • Machine learning
  • AI safety
  • Reinforcement learning
  • Iterative training
  • Self-training
  • Open-ended learning
  • Empirical safety evaluation
  • Scalable oversight
  • Stress-testing of generative model training pipelines
  • Software skills for building controlled experimental harnesses
  • Reproducible RSI microcosms

Nice to have

  • Publications
  • PhD in Computer Science, Machine Learning, or related field

What the JD emphasized

  • explicit governance
  • human oversight
  • auditable and safe
  • skeptical by default but constructive
  • formalize assumptions
  • bound autonomy
  • insist on counterfactual evaluation
  • concrete metrics
  • empirical safety evaluation
  • scalable oversight
  • stress-testing
  • controlled experimental harnesses
  • reproducible RSI microcosms

Other signals

  • recursive self-improvement
  • AI safety
  • reinforcement learning
  • bounded autonomy
  • human oversight
  • auditable and safe pipelines