AI Research Manager/Scientist, Reinforcement Learning

Autodesk · Enterprise · San Francisco, CA +7 · Remote

Autodesk Research is seeking an AI Scientist Manager to lead post-training and model alignment efforts. This role involves managing and growing a team of AI scientists while also contributing as a hands-on researcher. Responsibilities include leading instruction tuning, preference optimization (RLHF, RLAIF, DPO, PPO), and domain-specific post-training. The role also involves designing and maintaining evaluation frameworks for reasoning, tool-use, safety, and robustness, and providing go/no-go recommendations for model releases. The position requires a PhD or equivalent experience, proven people management skills, and expertise in LLMs, fine-tuning, and experimental design.

What you'd actually do

  1. Lead and contribute directly to post-training pipelines, including:
      • Instruction tuning and multi-task fine-tuning
      • Preference optimization (RLHF, RLAIF, DPO, PPO, and related methods)
      • Domain-specific post-training and specialization for the AECO, Manufacturing, and Media & Entertainment industries
  2. Design and run experiments that shape model behavior, robustness, and reliability
  3. Design and maintain evaluation frameworks that measure:
      • Long-horizon reasoning and planning
      • Tool-use and agentic behavior
      • Safety, robustness, and alignment
      • Regression and behavioral drift across releases
  4. Lead human-in-the-loop evaluation, ensuring annotation quality, consistency, and bias awareness
  5. Manage, mentor, and grow a team of AI scientists working on post-training and alignment

Skills

Required

  • PhD or equivalent industry experience in Machine Learning, AI, or a related field
  • Proven experience as a people manager of technical research or ML teams
  • Strong hands-on expertise in:
      • Large language models or foundation models
      • Fine-tuning and post-training methods (e.g., RLHF, DPO, instruction tuning)
      • Experimental design and evaluation
  • Ability to move fluidly between research depth and organizational leadership
  • Strong communication skills, with the ability to explain complex trade-offs to technical and non-technical audiences

Nice to have

  • Experience operating in an AI research lab or frontier model organization
  • Background in human-in-the-loop systems, preference learning, or alignment research
  • Experience shipping or supporting production AI systems
  • Familiarity with large-scale training infrastructure and compute cost trade-offs
  • Experience in Architecture, Civil or Mechanical Engineering, Construction, Manufacturing, Media & Entertainment, or other Autodesk domains

What the JD emphasized

  • model alignment
  • model readiness
  • model behavior
  • model releases
  • post-training

Other signals

  • leading post-training and model alignment efforts
  • managing and growing a team of AI scientists
  • hands-on researcher
  • transformation of foundation models into reliable, aligned, and production-ready systems
  • publications at top-tier conferences
  • design and maintain evaluation frameworks
  • human-in-the-loop evaluation
  • provide clear go / no-go recommendations for model releases