Systems Engineer, HPC

Mistral AI · AI Frontier · Paris, France · Engineering & Infra

Systems Engineer/Administrator role focused on designing, operating, and scaling the HPC infrastructure used for AI model training and research. Responsibilities include Linux systems administration, automation, performance tuning, and support for production and research workloads.

What you'd actually do

  1. Operate and maintain large-scale Linux environments (bare metal, clusters, cloud)
  2. Monitor system health, troubleshoot incidents, and ensure high availability
  3. Support production and research workloads across multiple environments
  4. Help scale clusters toward hundreds to thousands of nodes
  5. Work on systems handling petabyte-scale storage

Skills

Required

  • Linux systems administration
  • Large-scale environments (HPC clusters or cloud)
  • Job schedulers (e.g. Slurm)
  • Troubleshooting systems, hardware, and networks

Nice to have

  • Containers / orchestration (e.g. Kubernetes)
  • Storage systems (e.g. Ceph, Lustre, NFS)
  • Networking fundamentals (Ethernet; InfiniBand is a plus)
  • Infrastructure as Code / automation tooling
  • GPU or AI/ML experience

What the JD emphasized

  • Strong Linux systems administration experience (core requirement)
  • Experience working in large-scale environments
  • HPC clusters or cloud infrastructure
  • GPU or AI/ML experience