ML Systems Performance Engineer

Cerebras · Semiconductors · India · Software

ML Systems Performance Engineer role focused on optimizing end-to-end model inference speed and throughput on Cerebras' AI chip. Responsibilities include performance modeling, kernel optimization, system-level debugging, and developing performance tooling. Requires a strong background in computer architecture, low-level deep learning math, and performance engineering with C++ and Python.

What you'd actually do

  1. Build performance models (kernel-level, end-to-end) to estimate the performance of state-of-the-art and customer ML models.
  2. Optimize and debug our kernel microcode and compiler algorithms to improve ML model inference speed, throughput, and compute utilization on the Cerebras WSE.
  3. Debug and understand runtime performance on the system and cluster.
  4. Develop tools and infrastructure to help visualize performance data collected from the Wafer Scale Engine and our compute cluster.
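
To give a sense of what a kernel-level performance model in item 1 can look like, here is a minimal roofline-style sketch for a single matrix multiply. It is an illustration only, not Cerebras' method: the `Hardware` class, the `estimate_matmul` function, and the peak numbers are all hypothetical, and the model assumes each operand is moved exactly once.

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    """Hypothetical device description (placeholder numbers, not a real chip)."""
    peak_flops: float     # peak compute rate, FLOP/s
    mem_bandwidth: float  # peak memory bandwidth, bytes/s


def estimate_matmul(m: int, n: int, k: int, bytes_per_elem: int, hw: Hardware) -> dict:
    """Roofline estimate for C[m,n] = A[m,k] @ B[k,n].

    Assumes A, B, and C each cross the memory interface once and that
    compute and data movement overlap perfectly, so the slower of the
    two sets the runtime.
    """
    flops = 2.0 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    compute_time = flops / hw.peak_flops
    memory_time = bytes_moved / hw.mem_bandwidth
    return {
        "seconds": max(compute_time, memory_time),
        "bound": "compute" if compute_time >= memory_time else "memory",
        "arithmetic_intensity": flops / bytes_moved,  # FLOP per byte
    }


if __name__ == "__main__":
    hw = Hardware(peak_flops=1e12, mem_bandwidth=1e11)  # illustrative values
    # Large square matmul versus a batch-1 matrix-vector product.
    print(estimate_matmul(1000, 1000, 1000, 2, hw))
    print(estimate_matmul(1, 1000, 1000, 2, hw))
```

Under these assumed numbers the square matmul comes out compute-bound and the batch-1 case memory-bound, which is the kind of distinction an end-to-end inference model is built up from, kernel by kernel.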

Skills

Required

  • Bachelor's / Master's / PhD in Electrical Engineering or Computer Science
  • Strong background in computer architecture
  • Exposure to and understanding of low-level deep learning / LLM math
  • Strong analytical and problem-solving mindset
  • 3+ years of experience in a relevant domain (Computer Architecture, CPU/GPU Performance, Kernel Optimization, HPC)
  • Experience working on CPU/GPU simulators
  • Exposure to performance profiling and debugging on any system pipeline
  • Comfort with C++ and Python

What the JD emphasized

  • end-to-end model inference speed and throughput
  • low-level kernel performance debugging and optimization
  • system-level performance analysis
  • performance modeling and estimation
  • development of tooling for performance projection and diagnostics
  • low-level deep learning / LLM math
  • CPU/GPU simulators
  • performance profiling and debug

Other signals

  • Optimizing inference speed and throughput
  • Low-level kernel performance debugging and optimization
  • System-level performance analysis
  • Performance modeling and estimation
  • Development of tooling for performance projection and diagnostics