Senior ML Systems Engineer

Cerebras · Semiconductors · US and Canada Offices · Software

We are seeking a Senior ML Systems Engineer to join the SOTA Training Platform team, which is responsible for bringing up state-of-the-art open-source and proprietary ML models on Cerebras CSX systems. The role spans the full stack, including model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning, with a focus on debugging and improving the bring-up process.

What you'd actually do

  1. Contribute to the end-to-end bring-up of ML models on Cerebras CSX systems.
  2. Work across the stack: model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning.
  3. Debug performance and correctness issues spanning model code, compiler IRs, runtime behavior, and hardware utilization.
  4. Propose and prototype improvements across tools, APIs, and automation flows to accelerate future bring-ups.
  5. Study emerging training and post-training algorithms and map them onto the Cerebras software architecture and hardware.

Skills

Required

  • Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field.
  • 5+ years of relevant industry experience (internship/co-op experience included).
  • Comfort navigating the full AI toolchain: Python modeling code, compiler IRs, performance profiling, etc.
  • Strong debugging skills across performance, numerical accuracy, and runtime integration.
  • Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and familiarity with model internals (e.g., attention, MoE, diffusion).
  • Proficiency in C/C++ programming and experience with low-level optimization.
  • Proven experience in compiler development, particularly with LLVM and/or MLIR.
  • Strong background in optimization techniques, particularly those involving NP-hard problems.
  • Familiarity with large-scale ML systems and state-of-the-art algorithms, including model training and reinforcement learning.

Nice to have

  • Interest in studying emerging training and post-training algorithms.

What the JD emphasized

  • full AI toolchain
  • compiler development
  • low-level optimization
  • optimization techniques, particularly those involving NP-hard problems

Other signals

  • bring up state-of-the-art open-source models
  • customer-provided proprietary models
  • Cerebras CSX systems
  • full AI toolchain
  • compiler optimizations
  • performance tuning