What AI roles is Cerebras hiring for?

Cerebras currently has 39 active AI-related roles in our index. The most common open titles are: Kernel Engineer (2); ML Systems Performance Engineer (2); LLM Inference Performance & Evals Engineer; AI Infrastructure Operations Engineer; and AI Models, Product Manager. Most positions are in Engineering and Research.

What stage of AI development does Cerebras focus on?

Cerebras's active AI hiring is concentrated in: serving infrastructure (85%), post-training (8%), pre-training (5%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.

Where is Cerebras hiring AI talent?

Cerebras is hiring AI talent in: United States (23 roles), Canada (20 roles), India (6 roles), United Arab Emirates (3 roles).

What technologies does Cerebras's AI team work with?

Job postings at Cerebras most frequently reference: model serving, inference infra, fine-tuning, LLM observability, frontier research.

How many AI roles has Cerebras posted recently?

In the past 30 days, Cerebras has posted 4 new AI-related roles.

Cerebras — AI hiring signals

Cerebras currently has 38 active AI-related job listings. The majority of these roles, 79%, are focused on serving infrastructure. The top hiring function is Engineering, with 32 roles. The company is actively hiring in the United States and Canada. Frequent tech tags include model_serving and inference_infra. In the last 30 days, Cerebras posted 4 new AI roles, representing a 20% decrease compared to the previous 30-day period.

Auto-generated from active job postings · last refreshed 2026-05-24

Currently tracking 36 active AI roles, up 46% versus the prior 4 weeks. Primary focus: Serve · Engineering. Salary range $170k–$250k (avg $206k).
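The momentum figure can be reproduced from the two window counts this page reports (19 openings in the last 4 weeks, 13 in the prior 4 weeks). The sketch below is only an illustration of that arithmetic; the function name `momentum` is hypothetical and is not taken from the index's own tooling.

```python
def momentum(current: int, prior: int) -> tuple[int, float]:
    """Return (absolute change, percent change) between two 4-week windows.

    Hypothetical helper: assumes the percentage is computed relative to
    the prior window, which matches the +46% reported on this page.
    """
    if prior <= 0:
        raise ValueError("prior window must be positive to compute a percentage")
    delta = current - prior
    return delta, 100.0 * delta / prior


# Window counts as reported on this page: 19 opens last 4w, 13 prior 4w.
delta, pct = momentum(current=19, prior=13)
print(f"{delta:+d} roles, {pct:+.0f}%")  # prints "+6 roles, +46%"
```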

Hiring: 36 / 36

Momentum (4w): ↑ +6 (+46%), with 19 opens in the last 4 weeks and 13 in the prior 4 weeks

Salary range: $170k–$250k (avg $206k), USD, disclosed roles only

Tracked since: Mar '24 (last role 5w ago)

Hiring velocity: weekly chart of new AI roles posted, with between 1 and 9 new roles per charted week (latest labeled week: Jun 1).

Jobs (115)

38 AI · 98 total active

Title	Stage	Function	Location	First seen	AI score
Advanced Technology: AI/ML Research Scientist Research Scientist role focused on designing AI models and training methods from first principles, leveraging novel wafer-scale hardware architectures. The role involves investigating computational science techniques for AI, understanding hardware-algorithm interactions, and publishing research at top-tier venues. The work directly influences future hardware and software design.	Pretrain	Research	Headquarters +3	Apr 6	10
Lead Full Stack Machine Learning Engineer This role focuses on bringing up and optimizing open-source AI models and frameworks on Cerebras' wafer-scale hardware. It involves working across the full software stack, from model translation and compiler optimizations to runtime integration and performance tuning, with a strong emphasis on debugging and improving the bring-up process for future models.	Serve, Post-train	Engineering	India	2w ago	9
ML Research Engineer (Inference) Research Engineer focused on adapting and optimizing advanced language and vision models for efficient inference on Cerebras' wafer-scale AI architecture. The role involves implementing, validating, and optimizing models for low-latency, high-throughput inference, with a focus on techniques like speculative decoding, pruning, compression, and sparsity.	Serve	Research	India	Apr 8	9
Advanced Technology: R&D Engineer - AI/ML, HPC Research Engineer role focused on designing and implementing AI/ML workloads on Cerebras' wafer-scale hardware, optimizing performance, and contributing to future hardware/software roadmaps. Involves algorithm-hardware co-design, performance modeling, and publishing research.	Serve	Research	Headquarters +3	Apr 6	9
Applied Machine Learning Research Scientist This role focuses on applying and scaling modern machine learning techniques, particularly LLM post-training (RLHF, GRPO), on Cerebras' wafer-scale AI chip. The scientist will build and maintain training pipelines, evaluation frameworks, and optimize ML workflows across pretraining, fine-tuning, and alignment stages, working with large datasets and contributing to shared ML infrastructure.	Post-train, Data	Engineering	Headquarters +2	Mar 5	9
Kernel Engineer The Kernel Engineer will develop high-performance software solutions for AI and HPC workloads, focusing on implementing, optimizing, and scaling deep learning operations on Cerebras' custom hardware. This involves designing, developing, and debugging low-level kernels and algorithms to maximize compute utilization and training efficiency, while also studying emerging ML trends and interacting with hardware architects.	Serve, Post-train	Engineering	Headquarters +2	Feb 23	9
Senior ML Systems Engineer Senior ML Systems Engineer to join the SOTA Training Platform team, responsible for bringing up state-of-the-art open-source and proprietary ML models on Cerebras CSX systems. This role involves working across the full stack, including model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning, with a focus on debugging and improving the bring-up process.	Post-train, Serve	Engineering	US and Canada Offices	Feb 12	9
Applied AI/ML Scientist Applied AI Scientist role focused on developing and customizing large language and deep learning models for customer problems using Cerebras' wafer-scale engine. Responsibilities include customer use case discovery, architecting and executing end-to-end training recipes, fine-tuning models, building agentic system components, and providing technical customer leadership. Requires strong expertise in deep learning, large model training/fine-tuning, Python, PyTorch, and distributed training.	Post-train, Agent	Engineering	United Arab Emirates	Jan 14	9
Principal ML Investigator Cerebras is seeking a Principal ML Investigator to lead a new ML team focused on advanced development in areas like post-training, reinforcement learning, dataset curation, LLM pretraining, sparsity, and various domains (coding agents, reasoning agents, generative language, image, video). The role involves building the team, formulating research agendas, adapting algorithms to Cerebras hardware, training/tuning/evaluating models, and collaborating with internal and external partners.	Pretrain, Post-train	Research	Headquarters +1	Dec '25	9
Staff Inference ML Runtime Engineer Staff Inference ML Runtime Engineer at Cerebras Systems, focusing on optimizing and scaling their wafer-scale AI chip for high-throughput, low-latency generative AI inference. The role involves designing and implementing ML features, APIs, and distributed runtime solutions, working with state-of-the-art generative AI models and multimodal data.	Serve	Engineering	Headquarters +2	Nov '25	9
Senior Runtime Engineer Senior Runtime Engineer role at Cerebras, focusing on designing and developing high-performance distributed software for large-scale AI training and inference workloads on their wafer-scale architecture. The role involves optimizing compute and data pipelines, ensuring scalability, and collaborating with ML and compiler teams. Requires strong C++ and distributed systems experience, with familiarity in ML pipelines preferred.	Serve, Agent	Engineering	Headquarters +2	Oct '25	9
Kernel Engineer Kernel Engineer role focused on developing and optimizing high-performance software for Cerebras' AI chip, specifically implementing and scaling deep learning operations and building parallel algorithms for training and inference. The role involves low-level programming, performance tuning, and interaction with hardware architects to maximize compute utilization and accelerate AI innovation.	Serve, Pretrain	Engineering	India	Oct '25	9
LLM Inference Performance & Evals Engineer Cerebras is seeking an LLM Inference Performance & Evals Engineer to optimize and validate state-of-the-art models on their wafer-scale AI hardware. The role involves prototyping architectural tweaks, building performance-evaluation pipelines, and collaborating with hardware and software teams to accelerate new model ideas and improve inference speeds.	Serve, Eval Gate	Engineering	Toronto, ON	Jul '25	9
Full Stack LLM Engineer Cerebras is seeking a Full Stack LLM Engineer to join their Inference Core Model Bringup team. This role involves bringing up state-of-the-art open-source and proprietary models on Cerebras CSX systems, working across the entire software stack from model translation and compiler optimizations to runtime integration and performance tuning. The engineer will debug performance and correctness issues and propose improvements to tools and automation. Experience with deep learning frameworks, model internals, C/C++, and compiler development (LLVM/MLIR) is required.	Serve	Engineering	Toronto, ON	Jul '25	9
ML Systems Performance Engineer ML Systems Performance Engineer role focused on optimizing inference speed and throughput on Cerebras' custom wafer-scale AI chip. Responsibilities include building performance models, optimizing kernel microcode and compiler algorithms, debugging runtime performance, and developing performance visualization tools. Requires strong background in computer architecture, low-level deep learning math, and experience with performance profiling and optimization on CPU/GPU simulators.	Serve	Engineering	India	1w ago	8
Senior Performance Engineer, Inference Senior Performance Engineer focused on benchmarking Cerebras' AI inference performance against competitors and analyzing pricing models. Requires deep expertise in open-source inference stacks, GPU optimization, and LLM inference economics.	Serve	Engineering	Headquarters +1	Apr 13	8
Engineering Manager, Inference ML Runtime Engineering Manager for Inference ML Runtime at Cerebras, leading a team to design and scale systems for executing state-of-the-art AI models on Cerebras hardware. The role focuses on ML, distributed systems, and high-performance runtime engineering, with a goal of delivering the fastest Generative AI inference solution.	Serve	Engineering	Headquarters +2	Mar 24	8
ML Performance Benchmarking Engineer ML Performance Benchmarking Engineer role focused on optimizing AI inference performance on Cerebras' wafer-scale architecture. Responsibilities include building observability and benchmarking infrastructure, performance analysis, and integrating new inference features. Requires strong Python/C++ and infrastructure scaling experience, with a focus on complex, large-scale systems.	Serve	Engineering	Toronto, ON	Mar 18	8
New Grad - ML Stack Optimization Engineer New Grad ML Stack Optimization Engineer role at Cerebras, focusing on optimizing compiler technologies for AI chips using LLVM and MLIR frameworks to enhance performance and efficiency of AI applications on their wafer-scale architecture.	Serve	Engineering	Headquarters +2	Feb 5	8
Staff Kernel Optimization Engineer Staff Kernel Optimization Engineer role focused on developing and optimizing high-performance software for Cerebras' custom wafer-scale AI chip, specifically for deep learning operations and inference. This involves implementing and debugging low-level kernels, mapping algorithms to hardware, and studying emerging AI trends to evolve kernel library architecture. The role contributes to accelerating AI innovation and delivering industry-leading training and inference speeds.	Serve, Pretrain	Engineering	Office, United Arab Emirates · Remote	Feb 5	8
ML Systems Performance Engineer ML Systems Performance Engineer at Cerebras, focusing on optimizing end-to-end model inference speed and throughput on their wafer-scale AI chip. Responsibilities include kernel optimization, system performance analysis, and developing performance modeling and diagnostic tools.	Serve	Engineering	Headquarters +2	Jan 21	8
AI Models, Product Manager Product Manager for AI Models at Cerebras, focusing on defining and launching the strategic model portfolio for their wafer-scale AI inference platform. Responsibilities include roadmap ownership, partnerships with AI labs and open-source communities, defining quality standards, leading go-to-market strategies, and making technical decisions on performance optimizations. The role requires strong product management experience, technical knowledge of AI models and inference, and cross-functional leadership.	Ship, Serve	Product	Headquarters +1	Jan 15	8
Performance & Reliability Engineer The Performance & Reliability Engineer will characterize and optimize the performance and reliability of advanced ML hardware/software systems, focusing on reducing power and thermal fluctuations. This role involves analyzing ML workloads, software kernels, and hardware architecture, developing software solutions for reliability and performance, and influencing next-generation AI architecture design.	Serve	Engineering	Headquarters +1	Nov '25	8
Staff Python / PyTorch Developer — Frontend Inference Compiler – Dubai Staff Python/PyTorch Developer for Frontend Inference Compiler at Cerebras, focusing on optimizing generative AI models for their wafer-scale AI chip. Responsibilities include developing compiler infrastructure, analyzing new models, and improving inference performance.	Serve	Engineering	United Arab Emirates	Oct '25	8
Product Manager, Strategic Verticals Product Manager for Strategic Verticals at Cerebras, focusing on embedding with strategic customers to translate their ambitions into AI solutions using Cerebras' wafer-scale architecture. The role involves owning customer success, designing PoCs, navigating complexities, and influencing the product roadmap by distilling customer insights.	Ship, Serve	Product	Headquarters +1	Sep '25	8
Software Engineer, Inference Platform Software Engineer for Cerebras' Inference Platform team, focusing on the orchestration layer for inference on datacenter clusters. Responsibilities include shaping platform direction, ensuring reliability and performance of active-active systems, writing production code, leading production issues, and partnering with ML/Product/Infra teams. Requires 3+ years of experience in distributed systems, Kubernetes, and building highly available, latency-sensitive systems. Experience with ML inference infrastructure is a plus.	Serve, Agent	Engineering	Headquarters +1	1w ago	7
Staff Software Engineer, Inference Platform Staff Software Engineer for Cerebras' Inference Platform team, focusing on the orchestration layer for datacenter clusters. Responsibilities include platform direction, reliability, performance, execution on critical paths, production leadership, and technical influence. Requires 8+ years of experience in distributed systems, Kubernetes, and backend languages, with a plus for ML inference infrastructure experience.	Serve, Agent	Engineering	Headquarters +1	1w ago	7
Member of Technical Staff (Software Engineer) Software Engineer to implement and optimize high-performance, low-latency inference services on Cerebras' wafer-scale AI chip, focusing on Kubernetes deployment, resource management, and reliability. This role involves collaborating with ML engineers, debugging complex issues, and ensuring the scalability and fault tolerance of AI inference workloads.	Serve	Engineering	Headquarters +1	7w ago	7
Sr. Member of Technical Staff This role focuses on developing and maintaining cloud-based deployment workflows for AI inference software, utilizing containerization and orchestration technologies like Docker and Kubernetes. The responsibilities include ensuring system resiliency, high availability, and optimizing performance for low-latency inference tasks. The role also involves debugging, monitoring, and documenting inference services, with a strong emphasis on infrastructure-as-code and CI/CD practices.	Serve	Engineering	Headquarters +1	7w ago	7
Advanced Technology: Compiler Engineer Cerebras is seeking a Compiler Engineer to work on their Tungsten language compiler, which is purpose-built for their wafer-scale AI hardware. The role involves designing and implementing compiler passes, co-designing language constructs, and developing code generation strategies for AI and scientific workloads. The engineer will collaborate with ASIC, kernel, and AI teams, and contribute to the broader toolchain including runtime and debuggers. Experience with novel architectures and ML compiler frameworks is valuable.	Serve	Engineering	Headquarters +2	Mar 30	7
QA Lead (ML Integration and Quality) The QA Lead will be responsible for ensuring the quality of Cerebras' software across all supported ML workloads and workflows, focusing on feature testing, ML training accuracy and performance, and pre-deployment validation. This role involves driving quality, implementing testing methodologies, automating workflows, and debugging issues within a large-scale enterprise environment.	Serve, Post-train	Engineering	India	Mar 3	7
ML Software Tool Development Engineer ML Software Tool Development Engineer at Cerebras, focusing on building debugging, validation, and observability platforms for AI systems, including compilers, runtimes, and hardware interfaces. The role involves developing automated systems for anomaly detection, root-cause analysis, and visualization tools to support large-scale ML applications and inference.	Serve	Engineering	US and Canada Offices	Feb 17	7
Senior ML Software Engineer - Integration & Quality Senior ML Software Engineer focused on integrating and validating the software stack for the Cerebras AI platform, ensuring reliable and efficient execution of large-scale ML workloads. This role involves debugging complex distributed systems, improving automation, and enhancing the reliability of AI infrastructure, working closely with runtime, compiler, kernel, and hardware teams.	Serve	Engineering	Headquarters +2	Feb 5	7
Principal Engineer, AI Inference Reliability Principal Engineer, AI Inference Reliability at Cerebras, focusing on ensuring the reliability, performance, and security of their large-scale AI inference services built on wafer-scale architecture. The role involves defining reliability strategy, implementing mechanisms for fault tolerance, leading incident management, and collaborating across engineering teams to meet world-class reliability standards.	Serve	Engineering	Headquarters +2 · Remote	Oct '25	7
Site Reliability Engineer - Ops & Automation Cerebras is seeking a Site Reliability Engineer to support their high-performance AI inference services powered by the Wafer-Scale Engine. The role involves operational execution, developing self-service CD pipelines, building automation tools, and enhancing observability for large-scale AI infrastructure. The position requires production Kubernetes experience and proficiency in Python or Go.	Serve	Engineering	Headquarters +2	Oct '25	7
Staff Site Reliability Engineer – Automation and Platform Staff Site Reliability Engineer focused on building and scaling high-performance SRE functions for Cerebras' AI inference services, powered by their Wafer-Scale Engine. The role involves leading engineering efforts to implement self-service delivery pipelines, shared observability tooling, and GitOps-driven CD for model releases and cluster management. The goal is to enable core teams, product managers, and external customers to operate in a fully self-service model with strong reliability guarantees, while also mentoring early-career SREs. The role emphasizes turning complexity into reliability at scale for frontier AI inference.	Serve	Engineering	Headquarters +2	Oct '25	7
Principal Engineer, Inference Cloud Principal Engineer for Cerebras' Inference Cloud Platform, focusing on availability, latency, reliability, and multi-region scale for their AI chip-based inference solution. This senior IC role involves defining long-term architecture, driving execution on critical paths, and contributing production code for large-scale distributed systems.	Serve	Engineering	Headquarters +2	Sep '25	7
Performance Engineer The role focuses on optimizing the performance of Cerebras' Runtime software driver, which runs on x86 machines and supports their AI accelerator chip. Responsibilities include CPU and memory subsystem optimizations, developing efficient data movement algorithms, utilizing advanced CPU features, performance profiling, and influencing future hardware/software designs. The role requires strong C/C++ skills and experience in performance engineering and system-level tuning.	Serve	Engineering	Toronto, ON	Sep '25	7
Staff Software Engineer, Inference Cloud Staff Software Engineer role focused on building and operating the Inference Cloud Platform, responsible for availability, latency, reliability, and global scale of AI inference workloads. Requires deep expertise in distributed systems, high-QPS optimization, and experience with ML inference infrastructure.	Serve	Engineering	Headquarters +2	Jul '24	7
AI Infrastructure Operations Engineer The AI Infrastructure Operations Engineer will manage and operate Cerebras' advanced AI compute clusters, ensuring their health, performance, and availability. This role focuses on maximizing compute capacity, deploying container-based services, and providing 24/7 monitoring and support for large-scale machine learning infrastructure.	Serve	Engineering	Headquarters +2	Mar '24	7
Physical Design Engineer Cerebras Systems is seeking a Physical Design Engineer to work on the design and analysis of 3D integrated products, focusing on ASIC/SoC physical design, packaging, power, clock, and cooling analysis. The role involves R&D on novel concepts for 3D integration and requires extensive experience in physical design flows, verification methodologies, and optimization for power/performance/area. The company builds large AI chips and provides AI compute power for training and inference.	—	Engineering	Headquarters +1	2w ago	5
Sr. Staff/Staff Design Verification Engineer The Sr. Staff/Staff Design Verification Engineer at Cerebras will be responsible for ensuring the high-quality design of Cerebras' AI chips, which are designed for AI training and inference. This role involves developing verification strategies, creating reusable verification environments, implementing tests, managing regressions, and debugging complex issues across simulation, emulation, and silicon bring-up. The engineer will collaborate with cross-functional teams including architecture, RTL design, physical design, firmware, and validation to ensure first-time silicon success. The role requires deep knowledge of SystemVerilog testbench, UVM, and scripting languages like Python, with a strong emphasis on debugging and problem-solving skills.	—	Engineering	Headquarters +1	3w ago	5
ASIC Architect This role is for an ASIC Architect at Cerebras, a company that builds large AI chips. The architect will translate high-level architecture specifications into micro-architecture requirements, perform performance and power trade-offs, and identify hardware acceleration opportunities for AI workloads. While the company's product is AI-focused and used for AI training and inference, the role itself is in hardware architecture and performance modeling, not direct AI/ML model development.	—	Engineering	Headquarters +1	3w ago	5
Network Architect The Network Architect will design and architect front-end datacenter and interconnect fabrics for AI clusters, optimizing for high resource utilization, low latency, and high-throughput communication. This role involves building proof-of-concept implementations, automating deployment and configuration, and establishing SRE-grade telemetry and observability for network reliability. Responsibilities include leading network debugging in distributed systems, collaborating with vendors, and representing the company in industry forums.	—	Engineering	Headquarters +1	3w ago	5
Senior / Staff Technical Program Manager - Datacenter Capacity Delivery (E2E) This role is for a Senior/Staff Technical Program Manager responsible for the end-to-end delivery of data center capacity for AI workloads. The role involves managing the entire lifecycle from planning to operational readiness, orchestrating cross-functional teams, and ensuring alignment with AI infrastructure and hardware deployment schedules. While the company builds AI hardware and the role supports AI workloads, the core function is data center capacity delivery, not direct AI/ML model development or research.	—	Engineering	Headquarters +1	3w ago	5
Security & IT General Opportunities The IT & Security team at Cerebras, which builds a large AI chip, is looking for individuals to secure and scale enterprise IT, cloud, network, and infrastructure environments. This involves building automation, supporting security engineering, improving security practices, and developing processes for a rapidly growing organization supporting advanced AI workloads.	—	Engineering	US and Canada Offices	4w ago	5
Software Development Engineer in Test (Cloud) Software Development Engineer in Test (Cloud) for Cerebras, focusing on quality ownership and building scalable test infrastructure for their AI Inference Cloud platform, which utilizes their large-scale AI chip for training and inference.	Serve	Engineering	India	7w ago	5
Sr. Technical Staff This role focuses on post-silicon validation, testing, and debugging of Cerebras' AI chips, specifically their Wafer Scale Engines. Responsibilities include characterizing high-speed interfaces, supporting manufacturing operations, developing automated regression test scripts, and creating debug tools. The role requires a Master's degree and experience in hardware bring-up, debug, and high-speed interfaces.	—	Engineering	Headquarters +1	7w ago	5
Physical Design Engineer Cerebras Systems is seeking a Physical Design Engineer to work on their AI chip. The role involves synthesis, place and route, timing closure, and verification of their wafer-scale design. The company builds the world's largest AI chip, providing significant compute power for AI training and inference.	—	Engineering	India	7w ago	5
Prognostics & Health Monitoring Engineer This role focuses on building a prognostics and health monitoring (PHM) capability for Cerebras' AI hardware and systems. The engineer will develop frameworks to monitor, assess, and predict hardware health, transforming telemetry data into actionable insights for early detection of degradation and proactive failure prediction to ensure system availability and performance. It involves reliability engineering, data science, and system software integration.	Ship	Engineering	Headquarters +1	8w ago	5


Title

Stage

Function

Location

First seen

AI score

Advanced Technology: AI/ML Research Scientist

Research Scientist role focused on designing AI models and training methods from first principles, leveraging novel wafer-scale hardware architectures. The role involves investigating computational science techniques for AI, understanding hardware-algorithm interactions, and publishing research at top-tier venues. The work directly influences future hardware and software design.

Pretrain

Research

Headquarters +3

Apr 6

Lead Full Stack Machine Learning Engineer

This role focuses on bringing up and optimizing open-source AI models and frameworks on Cerebras' wafer-scale hardware. It involves working across the full software stack, from model translation and compiler optimizations to runtime integration and performance tuning, with a strong emphasis on debugging and improving the bring-up process for future models.

ServePost-train

Engineering

India

2w ago

ML Research Engineer (Inference)

Research Engineer focused on adapting and optimizing advanced language and vision models for efficient inference on Cerebras' wafer-scale AI architecture. The role involves implementing, validating, and optimizing models for low-latency, high-throughput inference, with a focus on techniques like speculative decoding, pruning, compression, and sparsity.

Serve

Research

India

Apr 8

Advanced Technology: R&D Engineer - AI/ML, HPC

Research Engineer role focused on designing and implementing AI/ML workloads on Cerebras' wafer-scale hardware, optimizing performance, and contributing to future hardware/software roadmaps. Involves algorithm-hardware co-design, performance modeling, and publishing research.

Serve

Research

Headquarters +3

Apr 6

Applied Machine Learning Research Scientist

This role focuses on applying and scaling modern machine learning techniques, particularly LLM post-training (RLHF, GRPO), on Cerebras' wafer-scale AI chip. The scientist will build and maintain training pipelines, evaluation frameworks, and optimize ML workflows across pretraining, fine-tuning, and alignment stages, working with large datasets and contributing to shared ML infrastructure.

Post-trainData

Engineering

Headquarters +2

Mar 5

Kernel Engineer

The Kernel Engineer will develop high-performance software solutions for AI and HPC workloads, focusing on implementing, optimizing, and scaling deep learning operations on Cerebras' custom hardware. This involves designing, developing, and debugging low-level kernels and algorithms to maximize compute utilization and training efficiency, while also studying emerging ML trends and interacting with hardware architects.

ServePost-train

Engineering

Headquarters +2

Feb 23

Senior ML Systems Engineer

Senior ML Systems Engineer to join the SOTA Training Platform team, responsible for bringing up state-of-the-art open-source and proprietary ML models on Cerebras CSX systems. This role involves working across the full stack, including model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning, with a focus on debugging and improving the bring-up process.

Post-trainServe

Engineering

US and Canada Offices

Feb 12

Applied AI/ML Scientist

Applied AI Scientist role focused on developing and customizing large language and deep learning models for customer problems using Cerebras' wafer-scale engine. Responsibilities include customer use case discovery, architecting and executing end-to-end training recipes, fine-tuning models, building agentic system components, and providing technical customer leadership. Requires strong expertise in deep learning, large model training/fine-tuning, Python, PyTorch, and distributed training.

Post-train · Agent

Engineering

United Arab Emirates

Jan 14

Principal ML Investigator

Cerebras is seeking a Principal ML Investigator to lead a new ML team focused on advanced development in areas like post-training, reinforcement learning, dataset curation, LLM pretraining, sparsity, and various domains (coding agents, reasoning agents, generative language, image, video). The role involves building the team, formulating research agendas, adapting algorithms to Cerebras hardware, training/tuning/evaluating models, and collaborating with internal and external partners.

Pretrain · Post-train

Research

Headquarters +1

Dec '25

Staff Inference ML Runtime Engineer

Staff Inference ML Runtime Engineer at Cerebras Systems, focusing on optimizing and scaling their wafer-scale AI chip for high-throughput, low-latency generative AI inference. The role involves designing and implementing ML features, APIs, and distributed runtime solutions, working with state-of-the-art generative AI models and multimodal data.

Serve

Engineering

Headquarters +2

Nov '25

Senior Runtime Engineer

Senior Runtime Engineer role at Cerebras, focusing on designing and developing high-performance distributed software for large-scale AI training and inference workloads on their wafer-scale architecture. The role involves optimizing compute and data pipelines, ensuring scalability, and collaborating with ML and compiler teams. Requires strong C++ and distributed systems experience, with familiarity in ML pipelines preferred.

Serve · Agent

Engineering

Headquarters +2

Oct '25

Kernel Engineer

Kernel Engineer role focused on developing and optimizing high-performance software for Cerebras' AI chip, specifically implementing and scaling deep learning operations and building parallel algorithms for training and inference. The role involves low-level programming, performance tuning, and interaction with hardware architects to maximize compute utilization and accelerate AI innovation.

Serve · Pretrain

Engineering

India

Oct '25

LLM Inference Performance & Evals Engineer

Cerebras is seeking an LLM Inference Performance & Evals Engineer to optimize and validate state-of-the-art models on their wafer-scale AI hardware. The role involves prototyping architectural tweaks, building performance-evaluation pipelines, and collaborating with hardware and software teams to accelerate new model ideas and improve inference speeds.

Serve · Eval Gate

Engineering

Toronto, ON

Jul '25

Full Stack LLM Engineer

Cerebras is seeking a Full Stack LLM Engineer to join their Inference Core Model Bringup team. This role involves bringing up state-of-the-art open-source and proprietary models on Cerebras CSX systems, working across the entire software stack from model translation and compiler optimizations to runtime integration and performance tuning. The engineer will debug performance and correctness issues and propose improvements to tools and automation. Experience with deep learning frameworks, model internals, C/C++, and compiler development (LLVM/MLIR) is required.

Serve

Engineering

Toronto, ON

Jul '25

ML Systems Performance Engineer

ML Systems Performance Engineer role focused on optimizing inference speed and throughput on Cerebras' custom wafer-scale AI chip. Responsibilities include building performance models, optimizing kernel microcode and compiler algorithms, debugging runtime performance, and developing performance visualization tools. Requires strong background in computer architecture, low-level deep learning math, and experience with performance profiling and optimization on CPU/GPU simulators.

Serve

Engineering

India

1w ago

Senior Performance Engineer, Inference

Senior Performance Engineer focused on benchmarking Cerebras' AI inference performance against competitors and analyzing pricing models. Requires deep expertise in open-source inference stacks, GPU optimization, and LLM inference economics.

Serve

Engineering

Headquarters +1

Apr 13

Engineering Manager, Inference ML Runtime

Engineering Manager for Inference ML Runtime at Cerebras, leading a team to design and scale systems for executing state-of-the-art AI models on Cerebras hardware. The role focuses on ML, distributed systems, and high-performance runtime engineering, with a goal of delivering the fastest Generative AI inference solution.

Serve

Engineering

Headquarters +2

Mar 24

ML Performance Benchmarking Engineer

ML Performance Benchmarking Engineer role focused on optimizing AI inference performance on Cerebras' wafer-scale architecture. Responsibilities include building observability and benchmarking infrastructure, performance analysis, and integrating new inference features. Requires strong Python/C++ and infrastructure scaling experience, with a focus on complex, large-scale systems.

Serve

Engineering

Toronto, ON

Mar 18

New Grad - ML Stack Optimization Engineer

New Grad ML Stack Optimization Engineer role at Cerebras, focusing on optimizing compiler technologies for AI chips using LLVM and MLIR frameworks to enhance performance and efficiency of AI applications on their wafer-scale architecture.

Serve

Engineering

Headquarters +2

Feb 5

Staff Kernel Optimization Engineer

Staff Kernel Optimization Engineer role focused on developing and optimizing high-performance software for Cerebras' custom wafer-scale AI chip, specifically for deep learning operations and inference. This involves implementing and debugging low-level kernels, mapping algorithms to hardware, and studying emerging AI trends to evolve kernel library architecture. The role contributes to accelerating AI innovation and delivering industry-leading training and inference speeds.

Serve · Pretrain

Engineering

Office, United Arab Emirates · Remote

Feb 5

ML Systems Performance Engineer

ML Systems Performance Engineer at Cerebras, focusing on optimizing end-to-end model inference speed and throughput on their wafer-scale AI chip. Responsibilities include kernel optimization, system performance analysis, and developing performance modeling and diagnostic tools.

Serve

Engineering

Headquarters +2

Jan 21

AI Models, Product Manager

Product Manager for AI Models at Cerebras, focusing on defining and launching the strategic model portfolio for their wafer-scale AI inference platform. Responsibilities include roadmap ownership, partnerships with AI labs and open-source communities, defining quality standards, leading go-to-market strategies, and making technical decisions on performance optimizations. The role requires strong product management experience, technical knowledge of AI models and inference, and cross-functional leadership.

Ship · Serve

Product

Headquarters +1

Jan 15

Performance & Reliability Engineer

The Performance & Reliability Engineer will characterize and optimize the performance and reliability of advanced ML hardware/software systems, focusing on reducing power and thermal fluctuations. This role involves analyzing ML workloads, software kernels, and hardware architecture, developing software solutions for reliability and performance, and influencing next-generation AI architecture design.

Serve

Engineering

Headquarters +1

Nov '25

Staff Python / PyTorch Developer — Frontend Inference Compiler – Dubai

Staff Python/PyTorch Developer for Frontend Inference Compiler at Cerebras, focusing on optimizing generative AI models for their wafer-scale AI chip. Responsibilities include developing compiler infrastructure, analyzing new models, and improving inference performance.

Serve

Engineering

United Arab Emirates

Oct '25

Product Manager, Strategic Verticals

Product Manager for Strategic Verticals at Cerebras, focusing on embedding with strategic customers to translate their ambitions into AI solutions using Cerebras' wafer-scale architecture. The role involves owning customer success, designing PoCs, navigating complexities, and influencing the product roadmap by distilling customer insights.

Ship · Serve

Product

Headquarters +1

Sep '25

Software Engineer, Inference Platform

Software Engineer for Cerebras' Inference Platform team, focusing on the orchestration layer for inference on datacenter clusters. Responsibilities include shaping platform direction, ensuring reliability and performance of active-active systems, writing production code, leading the response to production issues, and partnering with ML/Product/Infra teams. Requires 3+ years of experience in distributed systems, Kubernetes, and building highly available, latency-sensitive systems. Experience with ML inference infrastructure is a plus.

Serve · Agent

Engineering

Headquarters +1

1w ago

Staff Software Engineer, Inference Platform

Staff Software Engineer for Cerebras' Inference Platform team, focusing on the orchestration layer for datacenter clusters. Responsibilities include platform direction, reliability, performance, execution on critical paths, production leadership, and technical influence. Requires 8+ years of experience in distributed systems, Kubernetes, and backend languages. Experience with ML inference infrastructure is a plus.

Serve · Agent

Engineering

Headquarters +1

1w ago

Member of Technical Staff (Software Engineer)

Software Engineer to implement and optimize high-performance, low-latency inference services on Cerebras' wafer-scale AI chip, focusing on Kubernetes deployment, resource management, and reliability. This role involves collaborating with ML engineers, debugging complex issues, and ensuring the scalability and fault tolerance of AI inference workloads.

Serve

Engineering

Headquarters +1

7w ago

Sr. Member of Technical Staff

This role focuses on developing and maintaining cloud-based deployment workflows for AI inference software, utilizing containerization and orchestration technologies like Docker and Kubernetes. The responsibilities include ensuring system resiliency, high availability, and optimizing performance for low-latency inference tasks. The role also involves debugging, monitoring, and documenting inference services, with a strong emphasis on infrastructure-as-code and CI/CD practices.

Serve

Engineering

Headquarters +1

7w ago

Advanced Technology: Compiler Engineer

Cerebras is seeking a Compiler Engineer to work on their Tungsten language compiler, which is purpose-built for their wafer-scale AI hardware. The role involves designing and implementing compiler passes, co-designing language constructs, and developing code generation strategies for AI and scientific workloads. The engineer will collaborate with ASIC, kernel, and AI teams, and contribute to the broader toolchain including runtime and debuggers. Experience with novel architectures and ML compiler frameworks is valuable.

Serve

Engineering

Headquarters +2

Mar 30

QA Lead (ML Integration and Quality)

The QA Lead will be responsible for ensuring the quality of Cerebras' software across all supported ML workloads and workflows, focusing on feature testing, ML training accuracy and performance, and pre-deployment validation. This role involves driving quality, implementing testing methodologies, automating workflows, and debugging issues within a large-scale enterprise environment.

Serve · Post-train

Engineering

India

Mar 3

ML Software Tool Development Engineer

ML Software Tool Development Engineer at Cerebras, focusing on building debugging, validation, and observability platforms for AI systems, including compilers, runtimes, and hardware interfaces. The role involves developing automated systems for anomaly detection, root-cause analysis, and visualization tools to support large-scale ML applications and inference.

Serve

Engineering

US and Canada Offices

Feb 17

Senior ML Software Engineer - Integration & Quality

Senior ML Software Engineer focused on integrating and validating the software stack for the Cerebras AI platform, ensuring reliable and efficient execution of large-scale ML workloads. This role involves debugging complex distributed systems, improving automation, and enhancing the reliability of AI infrastructure, working closely with runtime, compiler, kernel, and hardware teams.

Serve

Engineering

Headquarters +2

Feb 5

Principal Engineer, AI Inference Reliability

Principal Engineer, AI Inference Reliability at Cerebras, focusing on ensuring the reliability, performance, and security of their large-scale AI inference services built on wafer-scale architecture. The role involves defining reliability strategy, implementing mechanisms for fault tolerance, leading incident management, and collaborating across engineering teams to meet world-class reliability standards.

Serve

Engineering

Headquarters +2 · Remote

Oct '25

Site Reliability Engineer - Ops & Automation

Cerebras is seeking a Site Reliability Engineer to support their high-performance AI inference services powered by the Wafer-Scale Engine. The role involves operational execution, developing self-service CD pipelines, building automation tools, and enhancing observability for large-scale AI infrastructure. The position requires production Kubernetes experience and proficiency in Python or Go.

Serve

Engineering

Headquarters +2

Oct '25

Staff Site Reliability Engineer – Automation and Platform

Staff Site Reliability Engineer focused on building and scaling high-performance SRE functions for Cerebras' AI inference services, powered by their Wafer-Scale Engine. The role involves leading engineering efforts to implement self-service delivery pipelines, shared observability tooling, and GitOps-driven CD for model releases and cluster management. The goal is to enable core teams, product managers, and external customers to operate in a fully self-service model with strong reliability guarantees, while also mentoring early-career SREs. The role emphasizes turning complexity into reliability at scale for frontier AI inference.

Serve

Engineering

Headquarters +2

Oct '25

Principal Engineer, Inference Cloud

Principal Engineer for Cerebras' Inference Cloud Platform, focusing on availability, latency, reliability, and multi-region scale for their AI chip-based inference solution. This senior IC role involves defining long-term architecture, driving execution on critical paths, and contributing production code for large-scale distributed systems.

Serve

Engineering

Headquarters +2

Sep '25

Performance Engineer

The role focuses on optimizing the performance of Cerebras' Runtime software driver, which runs on x86 machines and supports their AI accelerator chip. Responsibilities include CPU and memory subsystem optimizations, developing efficient data movement algorithms, utilizing advanced CPU features, performance profiling, and influencing future hardware/software designs. The role requires strong C/C++ skills and experience in performance engineering and system-level tuning.

Serve

Engineering

Toronto, ON

Sep '25

Staff Software Engineer, Inference Cloud

Staff Software Engineer role focused on building and operating the Inference Cloud Platform, responsible for availability, latency, reliability, and global scale of AI inference workloads. Requires deep expertise in distributed systems, high-QPS optimization, and experience with ML inference infrastructure.

Serve

Engineering

Headquarters +2

Jul '24

AI Infrastructure Operations Engineer

The AI Infrastructure Operations Engineer will manage and operate Cerebras' advanced AI compute clusters, ensuring their health, performance, and availability. This role focuses on maximizing compute capacity, deploying container-based services, and providing 24/7 monitoring and support for large-scale machine learning infrastructure.

Serve

Engineering

Headquarters +2

Mar '24

Physical Design Engineer

Cerebras Systems is seeking a Physical Design Engineer to work on the design and analysis of 3D integrated products, focusing on ASIC/SoC physical design, packaging, power, clock, and cooling analysis. The role involves R&D on novel concepts for 3D integration and requires extensive experience in physical design flows, verification methodologies, and optimization for power/performance/area. The company builds large AI chips and provides AI compute power for training and inference.

—

Engineering

Headquarters +1

2w ago

Sr. Staff/Staff Design Verification Engineer

The Sr. Staff/Staff Design Verification Engineer at Cerebras will be responsible for ensuring the high-quality design of Cerebras' AI chips, which are designed for AI training and inference. This role involves developing verification strategies, creating reusable verification environments, implementing tests, managing regressions, and debugging complex issues across simulation, emulation, and silicon bring-up. The engineer will collaborate with cross-functional teams including architecture, RTL design, physical design, firmware, and validation to ensure first-time silicon success. The role requires deep knowledge of SystemVerilog testbench, UVM, and scripting languages like Python, with a strong emphasis on debugging and problem-solving skills.

—

Engineering

Headquarters +1

3w ago

ASIC Architect

This role is for an ASIC Architect at Cerebras, a company that builds large AI chips. The architect will translate high-level architecture specifications into micro-architecture requirements, perform performance and power trade-offs, and identify hardware acceleration opportunities for AI workloads. While the company's product is AI-focused and used for AI training and inference, the role itself is in hardware architecture and performance modeling, not direct AI/ML model development.

—

Engineering

Headquarters +1

3w ago

Network Architect

The Network Architect will design and architect front-end datacenter and interconnect fabrics for AI clusters, optimizing for high resource utilization, low latency, and high-throughput communication. This role involves building proof-of-concept implementations, automating deployment and configuration, and establishing SRE-grade telemetry and observability for network reliability. Responsibilities include leading network debugging in distributed systems, collaborating with vendors, and representing the company in industry forums.

—

Engineering

Headquarters +1

3w ago

Senior / Staff Technical Program Manager - Datacenter Capacity Delivery (E2E)

This role is for a Senior/Staff Technical Program Manager responsible for the end-to-end delivery of data center capacity for AI workloads. The role involves managing the entire lifecycle from planning to operational readiness, orchestrating cross-functional teams, and ensuring alignment with AI infrastructure and hardware deployment schedules. While the company builds AI hardware and the role supports AI workloads, the core function is data center capacity delivery, not direct AI/ML model development or research.

—

Engineering

Headquarters +1

3w ago

Security & IT General Opportunities

The IT & Security team at Cerebras, which builds a large AI chip, is looking for individuals to secure and scale enterprise IT, cloud, network, and infrastructure environments. This involves building automation, supporting security engineering, improving security practices, and developing processes for a rapidly growing organization supporting advanced AI workloads.

—

Engineering

US and Canada Offices

4w ago

Software Development Engineer in Test (Cloud)

Software Development Engineer in Test (Cloud) for Cerebras, focusing on quality ownership and building scalable test infrastructure for their AI Inference Cloud platform, which utilizes their large-scale AI chip for training and inference.

Serve

Engineering

India

7w ago

Sr. Technical Staff

This role focuses on post-silicon validation, testing, and debugging of Cerebras' AI chips, specifically their Wafer Scale Engines. Responsibilities include characterizing high-speed interfaces, supporting manufacturing operations, developing automated regression test scripts, and creating debug tools. The role requires a Master's degree and experience in hardware bring-up, debug, and high-speed interfaces.

—

Engineering

Headquarters +1

7w ago

Physical Design Engineer

Cerebras Systems is seeking a Physical Design Engineer to work on their AI chip. The role involves synthesis, place and route, timing closure, and verification of their wafer-scale design. The company builds the world's largest AI chip, providing significant compute power for AI training and inference.

—

Engineering

India

7w ago

Prognostics & Health Monitoring Engineer

This role focuses on building a prognostics and health monitoring (PHM) capability for Cerebras' AI hardware and systems. The engineer will develop frameworks to monitor, assess, and predict hardware health, transforming telemetry data into actionable insights for early detection of degradation and proactive failure prediction to ensure system availability and performance. It involves reliability engineering, data science, and system software integration.

Ship

Engineering

Headquarters +1

8w ago