Research Lead / Principal Scientist & Manager, Post-Training · Alignment · Reinforcement Learning
Autodesk AI Lab: Toronto · Remote (CA)

Autodesk · Enterprise · Toronto, ON +4 · Remote

This is a Research Lead / Principal Scientist & Manager role for Post-Training, Alignment, and Reinforcement Learning at Autodesk AI Lab. The role focuses on turning foundation models into reliable, aligned, and useful systems for complex, domain-specific workflows in industries such as architecture, engineering, and construction. Responsibilities include leading research strategy, developing novel algorithms, managing a team of AI scientists, designing evaluation frameworks, and contributing to publications. The position draws on unique assets such as physics simulation engines for grounded reinforcement learning.

What you'd actually do

  1. Own post-training strategy for model development — from RLHF and preference optimization to agentic systems and long-horizon reasoning
  2. Design evaluation frameworks for long-horizon reasoning, tool use, agentic behavior, safety, and real-world workflow completion
  3. Manage, mentor, and grow a team of AI scientists
  4. Develop novel algorithms that improve model reliability, controllability, and alignment
  5. Make principled architectural decisions about when to address challenges at the pre-training, post-training, or system level

Skills

Required

  • reinforcement learning for foundation models
  • post-training methods (RLHF, RLAIF, DPO, PPO)
  • leading or mentoring technical research teams
  • intuition for model behavior, alignment challenges, and post-training trade-offs
  • designing evaluation systems
  • communicating complex technical trade-offs
  • PhD

Nice to have

  • human-in-the-loop evaluation
  • model analysis and interpretability

What the JD emphasized

  • post-training
  • alignment
  • reinforcement learning
  • foundation models
  • long-horizon reasoning
  • agentic systems
  • evaluation frameworks
  • model readiness criteria
  • PhD

Other signals

  • evaluating model quality