Senior Staff Research Scientist

DeepL · AI Frontier · Cologne · Research

DeepL is seeking a Senior Staff Research Scientist to lead the design, implementation, and deployment of cutting-edge research in reinforcement learning and post-training at scale for large language models. The role involves shaping scientific strategy, building and deploying RL pipelines, post-training multi-modal models for alignment and general capabilities, and driving the research-to-production lifecycle. The ideal candidate has a PhD, 5+ years of ML research experience, strong expertise in deep reinforcement learning, and hands-on experience scaling and deploying foundation models.

What you'd actually do

  1. Shape the scientific strategy and vision for post-training across DeepL: identify high-leverage directions, set technical standards, and lead research initiatives in a fast-paced environment
  2. Build and deploy state-of-the-art reinforcement learning pipelines at scale
  3. Post-train large (multi-modal) models to align them with human intent and enable general capabilities such as reasoning, pushing the boundaries of model performance, safety, and efficiency
  4. Drive the entire research-to-production lifecycle: from idea conception through theoretical modeling, prototyping, and ablation studies to production deployment
  5. Work closely with cross-functional leadership to shape the technical strategy and priorities of the foundational models research track

Skills

Required

  • PhD (or equivalent experience) in Computer Science, Machine Learning, Applied Mathematics, Physics or a related field
  • 5+ years of experience in ML research, including several years leading high-impact projects with responsibilities extending across different teams
  • Strong expertise in deep reinforcement learning (RLHF/RLAIF/RLVR)
  • Hands-on experience scaling and deploying LLMs or other foundation models in real-world systems
  • Strong programming skills and experience working with large compute clusters and ML infrastructure
  • Excellent communication skills, with the ability to explain complex topics clearly to diverse audiences
  • Track record of mentoring other scientists and setting long-term technical vision

Nice to have

  • Experience with multi-modal models

What the JD emphasized

  • Proven track record of driving research in reinforcement learning or large-scale model alignment to production
  • Track record of mentoring other scientists and setting long-term technical vision

Other signals

  • post-training stack for large language models
  • reinforcement learning
  • align pre-trained models with tasks and performance goals
  • integrate cutting-edge ideas into our core stack
  • enable new capabilities, better controllability, and safer, more effective user experiences