Research Engineer, Machine Learning

Mistral AI · AI Frontier · Palo Alto, CA · Research

Research Engineer focused on building and optimizing large-scale learning systems for open-weight models, working with Research Scientists to improve training frameworks, data pipelines, and cluster tooling, and to integrate cutting-edge research into production-grade components. The role involves running experiments on deep learning techniques, designing and implementing ML algorithms, and delivering prototypes for products like Le Chat and enterprise APIs.

What you'd actually do

  1. Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools.
  2. Interface cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs.
  3. Conduct experiments on the latest deep-learning techniques (sparsified 70B+ runs, distributed training on thousands of GPUs).
  4. Design, implement, and benchmark ML algorithms; write clear, efficient Python code.
  5. Deliver prototypes that become production-grade components for _Le Chat_ and our enterprise API.

Skills

Required

  • Python
  • PyTorch, JAX or TensorFlow
  • distributed training (DeepSpeed / FSDP / SLURM / K8s)
  • deep learning, NLP or LLMs
  • software-design instincts: testing, code review, CI/CD

Nice to have

  • CUDA
  • data-pipeline chops

What the JD emphasized

  • large-scale ML codebases
  • distributed training

Other signals

  • large-scale ML pipelines
  • distributed training on thousands of GPUs
  • latest deep-learning techniques
  • turn fresh ideas into repeatable, scalable code