Inference infra

Lower-level systems work optimizing how trained models actually run on GPUs: scheduling, custom kernels, paged attention, speculative decoding.

Primary AI lifecycle stage: serving infrastructure.

As of today, 2,740 active AI roles across 208 companies in our index reference Inference infra. Hiring concentrates at the serving infrastructure (59%) and agents (25%) stages. Most common sectors: Big Tech, Semiconductors, Enterprise.

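Top hiring: Amazon (334 roles), NVIDIA (326 roles), Google (176 roles), Capital One (107 roles), Microsoft (106 roles).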

By function: Engineering (3717) · Research (191) · Product (69)

Filtered by sector: Semiconductors. 716 AI roles tagged inference_infra.

Company	Title	Sector	AI score	Other tags
NVIDIA	Research Scientist, Generalist Embodied Agent Research - PhD New College Grad 2026	Semiconductors	10	Embodied AI · Agent research · Multimodal · RL robotics · Model serving · Synthetic data
NVIDIA	Senior Systems Software Engineer, AI Stack and Performance - DGX Station	Semiconductors	9	Model serving · Agent orchestration
AMD	Fellow, AI Workload Optimization	Semiconductors	9	Model serving · Fine-tuning · Quantization · Multimodal · LLM observability
NVIDIA	Senior Machine Learning Engineer, Perception - Autonomous Driving	Semiconductors	9	Vision · Model serving
NVIDIA	Senior Software Engineer, DGX Cloud AI Infrastructure	Semiconductors	9	Model serving
NVIDIA	Software Engineer, DGX Cloud AI Infrastructure	Semiconductors	9	Model serving
NVIDIA	Senior Deep Learning Performance Architect	Semiconductors	9	Model serving
NVIDIA	AI Inference Performance Engineer - New College Grad 2026	Semiconductors	9	Model serving · Quantization · Vision · Audio & speech
NVIDIA	Deep Learning Performance Software Engineer	Semiconductors	9	Model serving
AMD	Senior Software Development Engineer – LLM Inference Framework	Semiconductors	9	Model serving
NVIDIA	Senior Data Scientist - Security and Networking Research	Semiconductors	9	Agent orchestration · RAG · Tool use · Evals · Fine-tuning · Model serving · Multimodal · Synthetic data
NVIDIA	AI Computing Architect	Semiconductors	9	Model serving
AMD	多模态算法工程师（模型优化方向）/ Multimodal Algorithm Engineer (Model Optimization)	Semiconductors	9	Multimodal · Embodied AI · Agent orchestration · Fine-tuning · Model serving · Quantization
NVIDIA	AI Workload and Networking Research Architect	Semiconductors	9	Model serving · Frontier research · Multimodal · LLM observability
NVIDIA	Senior High Performance AI Engineer	Semiconductors	9	Agent orchestration · Tool use · Code gen · Model serving · Agent research
AMD	Principal AI Performance Modeling Architect	Semiconductors	9	Model serving · Fine-tuning · Multimodal · Vision · Audio & speech
NVIDIA	Senior Quantum AI Research Scientist, Applied Research	Semiconductors	9	Frontier research · Agent research · RL post-training · Fine-tuning · Model serving
AMD	Senior Agentic System & Application Engineer	Semiconductors	9	Agent orchestration · Tool use · LLM observability · Model serving · Evals · Code gen
NVIDIA	Senior LLM Agents Architect	Semiconductors	9	Agent orchestration · Tool use · Evals · LLM observability · RAG · Model serving · Code gen
NVIDIA	Software Engineer, AI and DL Kernel Libraries - New College Grad 2026	Semiconductors	9	Model serving
NVIDIA	Senior Performance Architect, Nemotron	Semiconductors	9	Model serving
Intel	Neuromorphic Applications Researcher- Temporary Position	Semiconductors	9	Embodied AI · Model serving
NVIDIA	Senior Research Scientist, Post-Training LLM and DLM	Semiconductors	9	Fine-tuning · RL post-training · Model serving · Evals
NVIDIA	Senior Software Engineer, Agentic AI – Nvidia Blueprints and NIM Integrations	Semiconductors	9	Agent orchestration · Tool use · Evals · RAG · Model serving
NVIDIA	Senior DL Algorithms Engineer - Inference Performance	Semiconductors	9	Model serving
NVIDIA	Senior Deep Learning Software Engineer, Inference	Semiconductors	9	Model serving · Multimodal
NVIDIA	Senior GenAI Technical Lead, Partner Platforms	Semiconductors	9	Agent orchestration · Tool use · RAG · Vector DB · Model serving · Multimodal
NVIDIA	Senior Solutions Architect, Autonomous Driving - GenAI	Semiconductors	9	Agent orchestration · Synthetic data · Model serving · Vision · Multimodal
NVIDIA	Machine Learning Applications and Compiler Engineer, LPX - New College Grad 2026	Semiconductors	9	Model serving
NVIDIA	Senior AI Engineer, World Foundation Models	Semiconductors	9	Vision · Multimodal · Fine-tuning · Model serving · Evals · Frontier research

Frequently asked questions

What is Inference infra in AI?
Lower-level systems work optimizing how trained models actually run on GPUs: scheduling, custom kernels, paged attention, speculative decoding. Primary AI lifecycle stage: serving infrastructure.
How many AI roles reference Inference infra right now?
2,740 active AI roles across 208 companies in our index reference Inference infra as of today.
Which companies are hiring for Inference infra roles?
The companies with the most active Inference infra listings are: Amazon (334 roles), NVIDIA (326 roles), Google (176 roles), Capital One (107 roles), Microsoft (106 roles).
What AI lifecycle stage does Inference infra belong to?
Inference infra primarily belongs to the serving infrastructure stage of the AI lifecycle. In current hiring, Inference infra roles concentrate at the serving infrastructure (59%) and agents (25%) stages.
What sectors invest most in Inference infra?
The sectors with the most active Inference infra hiring are: Big Tech, Semiconductors, Enterprise.