ML Performance Benchmarking Engineer

Cerebras · Semiconductors · Toronto, ON · Software

This ML Performance Benchmarking Engineer role focuses on optimizing AI inference performance on Cerebras' wafer-scale architecture. Responsibilities include building observability and benchmarking infrastructure, performing performance analysis, and integrating new inference features. It requires strong Python/C++ skills and experience scaling infrastructure, with a focus on complex, large-scale systems.

What you'd actually do

  1. Design and implement end-to-end telemetry systems across the software stack, providing deep visibility into inference performance and enabling rapid iteration before and after deployment.
  2. Architect, build, and scale the automation that generates, analyzes, and visualizes performance data used to inform business decisions across engineering and leadership.
  3. Dive deep into system behavior, dissect performance bottlenecks, and deliver actionable insights that directly influence which features ship and how they evolve.
  4. Partner closely with Core Platform teams to define rigorous testing methodologies that validate inference features for peak performance.
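To make the benchmarking and performance-analysis responsibilities above concrete, here is a minimal sketch of the kind of measurement harness such work starts from. This is an illustrative assumption, not Cerebras tooling: `benchmark` and the stand-in workload are hypothetical names, and a real inference benchmark would drive actual requests against the stack rather than a local callable.

```python
import statistics
import time


def benchmark(fn, *, warmup=10, iterations=100):
    """Time a callable and report latency percentiles and throughput.

    `fn` stands in for a single inference request; production
    harnesses would add concurrency, batching, and telemetry export.
    """
    for _ in range(warmup):  # warm caches before measuring
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples) * 1e3,
        "p99_ms": samples[int(0.99 * (len(samples) - 1))] * 1e3,
        "throughput_rps": len(samples) / sum(samples),
    }


if __name__ == "__main__":
    # Hypothetical workload standing in for one inference call.
    print(benchmark(lambda: sum(range(10_000))))
```

Reporting tail latency (p99) alongside the median matters for the "low-latency, high-throughput" emphasis in this posting, since averages hide the stragglers that dominate user-perceived performance.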

Skills

Required

  • Python
  • C++
  • building and scaling automated infrastructure
  • throughput and performance optimization techniques
  • complex, large-scale systems
  • problem-solving
  • analytical mindset
  • diving deep into new domains

Nice to have

  • the intersection of hardware and software
  • AI workloads and architectures

What the JD emphasized

  • low-latency, high-speed, high-throughput deployment
  • performance and scalability
  • performance optimization and improvements

Other signals

  • ML Performance Benchmarking
  • Inference Core Platform
  • model compilation and scheduling
  • custom hardware kernels and driver development
  • Core Inference Observability
  • Benchmarking Infrastructure
  • Performance Analysis
  • Feature Integration