Datacenter Hardware Engineer, HPC

Mistral AI · AI Frontier · Paris, France · Engineering & Infra

Mistral AI is seeking a Datacenter Hardware Engineer to maintain, troubleshoot, and scale the GPU/CPU clusters in its Paris-area datacenter. The role combines hands-on hardware work, diagnostics, preventive maintenance, and parts management with cross-team collaboration to keep one of France's largest GPU clusters healthy and reliable. That cluster is critical to AI research and development.

What you'd actually do

  1. Diagnose & operate core server/cluster components - Investigate and handle compute/storage hardware issues (CPU, memory, drives, NICs, GPUs, PSUs) and interconnect problems (switches, cables, transceivers; Ethernet/InfiniBand). Perform safe interventions (power-off/lockout, ESD) to replace, re-seat, or recable components and restore service.
  2. Safety & procedures - Apply lockout/tagout (LOTO) and ESD discipline; follow pre/post-work checklists; maintain tidy, safe work areas.
  3. First-line diagnostics - Triage using LEDs, POST, beep codes and basic tests; capture evidence (photos, serials, results); open/update/close tickets with clear notes.
  4. Preventive maintenance - Provide feedback and ideas to improve proactive activities, monitoring, and targeted follow-ups on recurring or specific anomalies; help turn ad-hoc checks into SOPs, alerts, and dashboards.
  5. Parts & logistics - Receive and track parts, keep labeled inventory accurate, manage simple RMAs, and coordinate with vendors.
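
As an illustration of how an ad-hoc check from the duties above might become a repeatable one, here is a minimal Python sketch that counts log lines by hardware category. It is not Mistral AI's tooling: the `PATTERNS` table, the `triage` function, and the sample log lines are all hypothetical, and the keywords are illustrative rather than an authoritative list of kernel or vendor messages.

```python
import re
from collections import Counter

# Hypothetical keyword patterns per hardware category (assumption: these
# are illustrative only; real fleets would tune them to their own logs).
PATTERNS = {
    "memory": re.compile(r"\b(ecc|edac|machine check)\b", re.IGNORECASE),
    "drive": re.compile(r"\b(i/o error|medium error)\b", re.IGNORECASE),
    "gpu": re.compile(r"\bxid\b", re.IGNORECASE),
    "nic": re.compile(r"\blink is down\b", re.IGNORECASE),
}


def triage(lines):
    """Count how many log lines match each hardware category."""
    counts = Counter()
    for line in lines:
        for category, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[category] += 1
    return counts


if __name__ == "__main__":
    # Invented sample lines, used only to exercise the function.
    sample = [
        "EDAC MC0: 1 CE memory read error",
        "I/O error, dev sda, sector 2048",
        "eth0: Link is Down",
        "Started daily cleanup job",
    ]
    for category, n in sorted(triage(sample).items()):
        print(f"{category}: {n}")
```

In practice the per-category counts would more likely feed a ticket note or a dashboard than be printed, but the shape of the check stays the same.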

Skills

Required

  • Hands-on datacenter/server hardware experience
  • Install/re-seat/swap GPU/PCIe cards, NICs, PSUs, drives
  • Work cleanly in racks (rails, cabling, labeling)
  • Linux fundamentals (boot/check, logs)
  • Disciplined and meticulous
  • Follows checklists
  • ESD and lockout/tagout (LOTO) discipline
  • Practical electrical basics (power-off, PPE, short-circuit risk awareness)
  • Comfortable in racks (cooling, network, storage, PDU, cable management)
  • Clear communicator
  • Punctual and process-minded
  • Hardware-passionate

Nice to have

  • HPC/AI/Cloud at scale experience
  • Large-fleet/server install & maintenance in datacenters
  • Basic networking (Ethernet/InfiniBand)
  • Basic Linux (boot/check)
  • Coding/automation skills (Python/Bash)
  • Experience with inventory/RMA tools and vendor coordination
  • Exposure to HPC/research/industrial environments

What the JD emphasized

  • GPU clusters
  • scale
  • groundbreaking AI solutions
  • GPU/PCIe cards
  • ESD
  • lockout/tagout (LOTO)
  • all high-value server components