Staff Research Scientist | Language AI

DeepL · AI Frontier · Cologne · Research

Staff Research Scientist focused on leading scientific innovation for DeepL's core text translation products. The role involves defining long-term scientific strategy, prototyping, running large-scale experiments, and driving breakthroughs into production. Responsibilities include designing, training, and optimizing LLM-based translation models, owning the full lifecycle of model delivery (prototyping, training, evaluation, optimization, deployment), and working with engineering teams on integration, reliability, and quality at scale. The role also emphasizes improvements in inference efficiency, model serving, voice UX, and robustness, as well as establishing practices for evaluation, reproducibility, monitoring, and continuous improvement. Mentoring researchers and engineers is also a key aspect.

What you'd actually do

  1. Lead hands-on research and development for our core text translation products.
  2. Drive innovation in methods for achieving perfect translation quality, as well as in translation quality estimation and evaluation.
  3. Design, train, and optimize large-scale, LLM-based translation models for multilingual accuracy, context awareness, robustness, and varying latency demands.
  4. Own the full lifecycle of model delivery: prototyping, ablations, training, evaluation, optimization, and production deployment.
  5. Work closely with engineering teams to integrate models into real-time systems, ensuring reliability, uptime, and quality at scale.

Skills

Required

  • Deep expertise in foundations and application of machine translation.
  • A hands-on builder who enjoys training models, running experiments, debugging pipelines, and integrating ML systems into production.
  • Experience shipping ML models to production, maintaining them at scale, and working with engineers on deployment, monitoring, and serving.
  • Ability to lead complex research efforts while staying grounded in product impact, user experience, and real-world performance.
  • Strong coding and experimentation skills (Python, PyTorch/JAX, audio processing libraries).
  • Ability to communicate clearly, collaborate across teams, and align research work with product and engineering priorities.
  • Proven experience mentoring others and elevating technical quality across a fast-moving, applied research team.

What the JD emphasized

  • foundational challenges
  • perfect translation quality
  • LLM-based translation models
  • multilingual accuracy
  • context awareness
  • robustness
  • varying latency demands
  • full lifecycle of model delivery
  • prototyping
  • ablations
  • training
  • evaluation
  • optimization
  • production deployment
  • real-time systems
  • reliability
  • uptime
  • quality at scale
  • inference efficiency
  • model serving
  • voice UX
  • robustness to real-world acoustic conditions
  • reproducibility
  • monitoring
  • continuous model improvement
  • production
  • foundations and application of machine translation
  • shipping ML models to production
  • maintaining them at scale
  • deployment
  • serving
  • user experience
  • real-world performance
  • Python
  • PyTorch/JAX
  • audio processing libraries
  • product and engineering priorities
  • technical quality
