Advanced Technology: AI/ML Research Scientist

Cerebras · Semiconductors · Headquarters +1 · Software

The role focuses on building and scaling the core inference platform, both software and hardware infrastructure, for Cerebras' AI chip, with an emphasis on low-latency, high-speed deployment. Responsibilities include designing telemetry systems for inference observability, architecting benchmarking infrastructure, analyzing performance bottlenecks, and integrating features with core platform teams.

What you'd actually do

  1. Core Inference Observability – Design and implement end-to-end telemetry systems across the software stack, providing deep visibility into inference performance and enabling rapid iteration before and after deployment.
  2. Benchmarking Infrastructure – Architect, build, and scale the automation that generates, analyzes, and visualizes performance data used to inform business decisions across engineering and leadership.
  3. Performance Analysis – Dive deep into system behavior, dissect performance bottlenecks, and deliver actionable insights that directly influence which features ship and how they evolve.
  4. Feature Integration – Partner closely with Core Platform teams to define rigorous testing methodologies that validate inference features for peak performance.

Skills

Required

  • Python
  • C++
  • building and scaling automated infrastructure
  • throughput and performance optimization techniques
  • complex, large-scale systems
  • problem-solving skills
  • analytical mindset
  • dive deep into new domains
  • fast-paced, ambiguous, and collaborative environment

Nice to have

  • hardware and software intersection
  • AI workloads and architectures

What the JD emphasized

  • low-latency, high-speed, high-throughput deployment
  • performance and scalability of AI inference
  • performance at scale

Other signals

  • inference performance
  • benchmarking infrastructure
  • telemetry systems