AI Hire Signal

Tracking AI hiring across 200+ US tech companies. Stage, salary, and stack signals on every role — refreshed weekly.

© 2026 AI Hire Signal · Not affiliated with companies shown

Cerebras currently has 38 active AI-related job listings. The majority of these roles, 79%, are focused on serving infrastructure. The top hiring function is Engineering, with 32 roles. The company is actively hiring in the United States and Canada. Frequent tech tags include model_serving and inference_infra. In the last 30 days, Cerebras posted 4 new AI roles, representing a 20% decrease compared to the previous 30-day period.

Auto-generated from active job postings · last refreshed 2026-05-24

Cerebras


Semiconductors · Wafer-scale AI chip

HQ: Sunnyvale, US
Founded: 2016
Website: cerebras.net

Currently tracking 36 active AI roles, up 46% versus the prior 4 weeks. Primary focus: Serve · Engineering. Salary range $170k–$250k (avg $206k).

Hiring: 36 / 36
Momentum (4w): ↑ +6 (+46%) · 19 opens last 4w · 13 prior 4w
Salary range: $170k–$250k (avg $206k) · USD · disclosed roles only
Tracked since: Mar '24 · last role 5w ago
Hiring velocity (new roles per week, oldest first)

Oct 23: 1
Mar 4: 1
Jul 8: 1
Mar 24: 1
Apr 7: 1
Apr 21: 1
Jul 14: 1
Jul 21: 1
Sep 8: 1
Sep 22: 1
Sep 29: 2
Oct 6: 1
Oct 13: 1
Oct 27: 3
Nov 10: 2
Nov 24: 4
Dec 8: 1
Dec 15: 1
Jan 5: 5
Jan 12: 2
Jan 19: 3
Jan 26: 2
Feb 2: 3
Feb 9: 3
Feb 16: 8
Feb 23: 5
Mar 2: 7
Mar 9: 1
Mar 16: 2
Mar 23: 2
Mar 30: 4
Apr 6: 6
Apr 13: 5
Apr 27: 4
May 4: 6
May 11: 2
May 18: 1
May 25: 4
Jun 1: 9
Jun 8: 2
Jun 15: 4
Jun 22: 4
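
The 4-week momentum figure on this page (+6, +46%, from 19 opens versus 13 in the prior 4 weeks) matches the last eight weekly counts in the hiring velocity series: 6, 2, 1, 4, then 9, 2, 4, 4. The sketch below shows one plausible way to derive it. The function name `momentum`, the `window` parameter, and the rounding rule are assumptions for illustration, not the site's documented method.

```python
def momentum(weekly_counts, window=4):
    """Compare the latest `window` weeks with the `window` weeks before them.

    Returns (delta, percent_change), where percent_change is rounded to the
    nearest whole percent. Hypothetical helper; the rounding rule is an assumption.
    """
    if len(weekly_counts) < 2 * window:
        raise ValueError("need at least two full windows of weekly counts")
    recent = sum(weekly_counts[-window:])
    prior = sum(weekly_counts[-2 * window:-window])
    if prior == 0:
        raise ValueError("percent change is undefined when the prior window is empty")
    delta = recent - prior
    return delta, round(100 * delta / prior)


# Last eight weekly counts from the hiring velocity series, oldest first.
last_eight_weeks = [6, 2, 1, 4, 9, 2, 4, 4]
delta, pct = momentum(last_eight_weeks)
print(f"{delta:+d} ({pct:+d}%)")  # prints "+6 (+46%)"
```

The prior window sums to 13 and the recent window to 19, so the change is 6 / 13, or about 46.2%, which rounds to the 46% shown.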

Frequently asked questions

  • What AI roles is Cerebras hiring for?

    Cerebras currently has 39 active AI-related roles in our index. The most common open titles are: Kernel Engineer (2), ML Systems Performance Engineer (2), LLM Inference Performance & Evals Engineer, AI Infrastructure Operations Engineer, AI Models, Product Manager. Most positions are in Engineering and Research.

  • What stage of AI development does Cerebras focus on?

    Cerebras's active AI hiring is concentrated in: serving infrastructure (85%), post-training (8%), pre-training (5%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.

  • Where is Cerebras hiring AI talent?

    Cerebras is hiring AI talent in: United States (23 roles), Canada (20 roles), India (6 roles), United Arab Emirates (3 roles).

  • What technologies does Cerebras's AI team work with?

    Job postings at Cerebras most frequently reference: model serving, inference infra, fine tuning, llm observability, frontier research.

  • How many AI roles has Cerebras posted recently?

    In the past 30 days, Cerebras has posted 4 new AI-related roles.

Jobs (32)

38 AI · 98 total active
Filtered by stage: Serve
Showing: active only, AI only (AI score ≥ 7)
Stage: Pretrain · 2, Post-train · 3, Serve · 32, Ship · 1
Function: Engineering · 33, Research · 4, Product · 1
Country: United States · 23, Canada · 20, India · 5, United Arab Emirates · 3

Title | Stage | Function | Location | First seen | AI score
Lead Full Stack Machine Learning Engineer
This role focuses on bringing up and optimizing open-source AI models and frameworks on Cerebras' wafer-scale hardware. It involves working across the full software stack, from model translation and compiler optimizations to runtime integration and performance tuning, with a strong emphasis on debugging and improving the bring-up process for future models.
Stage: Serve, Post-train | Function: Engineering | Location: India | First seen: 2w ago | AI score: 9
ML Research Engineer (Inference)
Research Engineer focused on adapting and optimizing advanced language and vision models for efficient inference on Cerebras' wafer-scale AI architecture. The role involves implementing, validating, and optimizing models for low-latency, high-throughput inference, with a focus on techniques like speculative decoding, pruning, compression, and sparsity.
Stage: Serve | Function: Research | Location: India | First seen: Apr 8 | AI score: 9
Advanced Technology: R&D Engineer - AI/ML, HPC
Research Engineer role focused on designing and implementing AI/ML workloads on Cerebras' wafer-scale hardware, optimizing performance, and contributing to future hardware/software roadmaps. Involves algorithm-hardware co-design, performance modeling, and publishing research.
Stage: Serve | Function: Research | Location: Headquarters +3 | First seen: Apr 6 | AI score: 9
Kernel Engineer
The Kernel Engineer will develop high-performance software solutions for AI and HPC workloads, focusing on implementing, optimizing, and scaling deep learning operations on Cerebras' custom hardware. This involves designing, developing, and debugging low-level kernels and algorithms to maximize compute utilization and training efficiency, while also studying emerging ML trends and interacting with hardware architects.
Stage: Serve, Post-train | Function: Engineering | Location: Headquarters +2 | First seen: Feb 23 | AI score: 9
Staff Inference ML Runtime Engineer
Staff Inference ML Runtime Engineer at Cerebras Systems, focusing on optimizing and scaling their wafer-scale AI chip for high-throughput, low-latency generative AI inference. The role involves designing and implementing ML features, APIs, and distributed runtime solutions, working with state-of-the-art generative AI models and multimodal data.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Nov '25 | AI score: 9
Senior Runtime Engineer
Senior Runtime Engineer role at Cerebras, focusing on designing and developing high-performance distributed software for large-scale AI training and inference workloads on their wafer-scale architecture. The role involves optimizing compute and data pipelines, ensuring scalability, and collaborating with ML and compiler teams. Requires strong C++ and distributed systems experience, with familiarity in ML pipelines preferred.
Stage: Serve, Agent | Function: Engineering | Location: Headquarters +2 | First seen: Oct '25 | AI score: 9
Kernel Engineer
Kernel Engineer role focused on developing and optimizing high-performance software for Cerebras' AI chip, specifically implementing and scaling deep learning operations and building parallel algorithms for training and inference. The role involves low-level programming, performance tuning, and interaction with hardware architects to maximize compute utilization and accelerate AI innovation.
Stage: Serve, Pretrain | Function: Engineering | Location: India | First seen: Oct '25 | AI score: 9
LLM Inference Performance & Evals Engineer
Cerebras is seeking an LLM Inference Performance & Evals Engineer to optimize and validate state-of-the-art models on their wafer-scale AI hardware. The role involves prototyping architectural tweaks, building performance-evaluation pipelines, and collaborating with hardware and software teams to accelerate new model ideas and improve inference speeds.
Stage: Serve, Eval Gate | Function: Engineering | Location: Toronto, ON | First seen: Jul '25 | AI score: 9
Full Stack LLM Engineer
Cerebras is seeking a Full Stack LLM Engineer to join their Inference Core Model Bringup team. This role involves bringing up state-of-the-art open-source and proprietary models on Cerebras CSX systems, working across the entire software stack from model translation and compiler optimizations to runtime integration and performance tuning. The engineer will debug performance and correctness issues and propose improvements to tools and automation. Experience with deep learning frameworks, model internals, C/C++, and compiler development (LLVM/MLIR) is required.
Stage: Serve | Function: Engineering | Location: Toronto, ON | First seen: Jul '25 | AI score: 9
ML Systems Performance Engineer
ML Systems Performance Engineer role focused on optimizing inference speed and throughput on Cerebras' custom wafer-scale AI chip. Responsibilities include building performance models, optimizing kernel microcode and compiler algorithms, debugging runtime performance, and developing performance visualization tools. Requires strong background in computer architecture, low-level deep learning math, and experience with performance profiling and optimization on CPU/GPU simulators.
Stage: Serve | Function: Engineering | Location: India | First seen: 1w ago | AI score: 8
Senior Performance Engineer, Inference
Senior Performance Engineer focused on benchmarking Cerebras' AI inference performance against competitors and analyzing pricing models. Requires deep expertise in open-source inference stacks, GPU optimization, and LLM inference economics.
Stage: Serve | Function: Engineering | Location: Headquarters +1 | First seen: Apr 13 | AI score: 8
Engineering Manager, Inference ML Runtime
Engineering Manager for Inference ML Runtime at Cerebras, leading a team to design and scale systems for executing state-of-the-art AI models on Cerebras hardware. The role focuses on ML, distributed systems, and high-performance runtime engineering, with a goal of delivering the fastest Generative AI inference solution.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Mar 24 | AI score: 8
ML Performance Benchmarking Engineer
ML Performance Benchmarking Engineer role focused on optimizing AI inference performance on Cerebras' wafer-scale architecture. Responsibilities include building observability and benchmarking infrastructure, performance analysis, and integrating new inference features. Requires strong Python/C++ and infrastructure scaling experience, with a focus on complex, large-scale systems.
Stage: Serve | Function: Engineering | Location: Toronto, ON | First seen: Mar 18 | AI score: 8
Staff Kernel Optimization Engineer
Staff Kernel Optimization Engineer role focused on developing and optimizing high-performance software for Cerebras' custom wafer-scale AI chip, specifically for deep learning operations and inference. This involves implementing and debugging low-level kernels, mapping algorithms to hardware, and studying emerging AI trends to evolve kernel library architecture. The role contributes to accelerating AI innovation and delivering industry-leading training and inference speeds.
Stage: Serve, Pretrain | Function: Engineering | Location: Office, United Arab Emirates · Remote | First seen: Feb 5 | AI score: 8
ML Systems Performance Engineer
ML Systems Performance Engineer at Cerebras, focusing on optimizing end-to-end model inference speed and throughput on their wafer-scale AI chip. Responsibilities include kernel optimization, system performance analysis, and developing performance modeling and diagnostic tools.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Jan 21 | AI score: 8
Performance & Reliability Engineer
The Performance & Reliability Engineer will characterize and optimize the performance and reliability of advanced ML hardware/software systems, focusing on reducing power and thermal fluctuations. This role involves analyzing ML workloads, software kernels, and hardware architecture, developing software solutions for reliability and performance, and influencing next-generation AI architecture design.
Stage: Serve | Function: Engineering | Location: Headquarters +1 | First seen: Nov '25 | AI score: 8
Staff Python / PyTorch Developer — Frontend Inference Compiler – Dubai
Staff Python/PyTorch Developer for Frontend Inference Compiler at Cerebras, focusing on optimizing generative AI models for their wafer-scale AI chip. Responsibilities include developing compiler infrastructure, analyzing new models, and improving inference performance.
Stage: Serve | Function: Engineering | Location: United Arab Emirates | First seen: Oct '25 | AI score: 8
Software Engineer, Inference Platform
Software Engineer for Cerebras' Inference Platform team, focusing on the orchestration layer for inference on datacenter clusters. Responsibilities include shaping platform direction, ensuring reliability and performance of active-active systems, writing production code, leading production issues, and partnering with ML/Product/Infra teams. Requires 3+ years of experience in distributed systems, Kubernetes, and building highly available, latency-sensitive systems. Experience with ML inference infrastructure is a plus.
Stage: Serve, Agent | Function: Engineering | Location: Headquarters +1 | First seen: 1w ago | AI score: 7
Staff Software Engineer, Inference Platform
Staff Software Engineer for Cerebras' Inference Platform team, focusing on the orchestration layer for datacenter clusters. Responsibilities include platform direction, reliability, performance, execution on critical paths, production leadership, and technical influence. Requires 8+ years of experience in distributed systems, Kubernetes, and backend languages, with a plus for ML inference infrastructure experience.
Stage: Serve, Agent | Function: Engineering | Location: Headquarters +1 | First seen: 1w ago | AI score: 7
Member of Technical Staff (Software Engineer)
Software Engineer to implement and optimize high-performance, low-latency inference services on Cerebras' wafer-scale AI chip, focusing on Kubernetes deployment, resource management, and reliability. This role involves collaborating with ML engineers, debugging complex issues, and ensuring the scalability and fault tolerance of AI inference workloads.
Stage: Serve | Function: Engineering | Location: Headquarters +1 | First seen: 7w ago | AI score: 7
Sr. Member of Technical Staff
This role focuses on developing and maintaining cloud-based deployment workflows for AI inference software, utilizing containerization and orchestration technologies like Docker and Kubernetes. The responsibilities include ensuring system resiliency, high availability, and optimizing performance for low-latency inference tasks. The role also involves debugging, monitoring, and documenting inference services, with a strong emphasis on infrastructure-as-code and CI/CD practices.
Stage: Serve | Function: Engineering | Location: Headquarters +1 | First seen: 7w ago | AI score: 7
Advanced Technology: Compiler Engineer
Cerebras is seeking a Compiler Engineer to work on their Tungsten language compiler, which is purpose-built for their wafer-scale AI hardware. The role involves designing and implementing compiler passes, co-designing language constructs, and developing code generation strategies for AI and scientific workloads. The engineer will collaborate with ASIC, kernel, and AI teams, and contribute to the broader toolchain including runtime and debuggers. Experience with novel architectures and ML compiler frameworks is valuable.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Mar 30 | AI score: 7
QA Lead (ML Integration and Quality)
The QA Lead will be responsible for ensuring the quality of Cerebras' software across all supported ML workloads and workflows, focusing on feature testing, ML training accuracy and performance, and pre-deployment validation. This role involves driving quality, implementing testing methodologies, automating workflows, and debugging issues within a large-scale enterprise environment.
Stage: Serve, Post-train | Function: Engineering | Location: India | First seen: Mar 3 | AI score: 7
ML Software Tool Development Engineer
ML Software Tool Development Engineer at Cerebras, focusing on building debugging, validation, and observability platforms for AI systems, including compilers, runtimes, and hardware interfaces. The role involves developing automated systems for anomaly detection, root-cause analysis, and visualization tools to support large-scale ML applications and inference.
Stage: Serve | Function: Engineering | Location: US and Canada Offices | First seen: Feb 17 | AI score: 7
Senior ML Software Engineer - Integration & Quality
Senior ML Software Engineer focused on integrating and validating the software stack for the Cerebras AI platform, ensuring reliable and efficient execution of large-scale ML workloads. This role involves debugging complex distributed systems, improving automation, and enhancing the reliability of AI infrastructure, working closely with runtime, compiler, kernel, and hardware teams.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Feb 5 | AI score: 7
Principal Engineer, AI Inference Reliability
Principal Engineer, AI Inference Reliability at Cerebras, focusing on ensuring the reliability, performance, and security of their large-scale AI inference services built on wafer-scale architecture. The role involves defining reliability strategy, implementing mechanisms for fault tolerance, leading incident management, and collaborating across engineering teams to meet world-class reliability standards.
Stage: Serve | Function: Engineering | Location: Headquarters +2 · Remote | First seen: Oct '25 | AI score: 7
Site Reliability Engineer - Ops & Automation
Cerebras is seeking a Site Reliability Engineer to support their high-performance AI inference services powered by the Wafer-Scale Engine. The role involves operational execution, developing self-service CD pipelines, building automation tools, and enhancing observability for large-scale AI infrastructure. The position requires production Kubernetes experience and proficiency in Python or Go.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Oct '25 | AI score: 7
Staff Site Reliability Engineer – Automation and Platform
Staff Site Reliability Engineer focused on building and scaling high-performance SRE functions for Cerebras' AI inference services, powered by their Wafer-Scale Engine. The role involves leading engineering efforts to implement self-service delivery pipelines, shared observability tooling, and GitOps-driven CD for model releases and cluster management. The goal is to enable core teams, product managers, and external customers to operate in a fully self-service model with strong reliability guarantees, while also mentoring early-career SREs. The role emphasizes turning complexity into reliability at scale for frontier AI inference.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Oct '25 | AI score: 7
Principal Engineer, Inference Cloud
Principal Engineer for Cerebras' Inference Cloud Platform, focusing on availability, latency, reliability, and multi-region scale for their AI chip-based inference solution. This senior IC role involves defining long-term architecture, driving execution on critical paths, and contributing production code for large-scale distributed systems.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Sep '25 | AI score: 7
Performance Engineer
The role focuses on optimizing the performance of Cerebras' Runtime software driver, which runs on x86 machines and supports their AI accelerator chip. Responsibilities include CPU and memory subsystem optimizations, developing efficient data movement algorithms, utilizing advanced CPU features, performance profiling, and influencing future hardware/software designs. The role requires strong C/C++ skills and experience in performance engineering and system-level tuning.
Stage: Serve | Function: Engineering | Location: Toronto, ON | First seen: Sep '25 | AI score: 7
Staff Software Engineer, Inference Cloud
Staff Software Engineer role focused on building and operating the Inference Cloud Platform, responsible for availability, latency, reliability, and global scale of AI inference workloads. Requires deep expertise in distributed systems, high-QPS optimization, and experience with ML inference infrastructure.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Jul '24 | AI score: 7
AI Infrastructure Operations Engineer
The AI Infrastructure Operations Engineer will manage and operate Cerebras' advanced AI compute clusters, ensuring their health, performance, and availability. This role focuses on maximizing compute capacity, deploying container-based services, and providing 24/7 monitoring and support for large-scale machine learning infrastructure.
Stage: Serve | Function: Engineering | Location: Headquarters +2 | First seen: Mar '24 | AI score: 7