AI Hire Signal

Tracking AI hiring across 200+ US tech companies. Stage, salary, and stack signals on every role — refreshed weekly.

© 2026 AI Hire Signal · Not affiliated with companies shown

Currently tracking 440 active AI roles, down 53% versus the prior 4 weeks. Primary focus: Serve · Engineering. Salary range $100k–$575k (avg $262k).

Hiring: 440 / 623
Momentum (4w): -386 (-53%) · 340 opens last 4w · 726 prior 4w
Salary range: $100k–$575k · avg $262k · USD · disclosed roles only
Tracked since: May '25 · last role 4w ago
Hiring velocity (new roles per week, listed as week label: count)
Dec 30: 1
Mar 10: 1 · Mar 24: 1
Apr 28: 1
May 12: 4 · May 19: 5 · May 26: 3
Jun 2: 3 · Jun 9: 2 · Jun 16: 1 · Jun 23: 2 · Jun 30: 3
Jul 7: 4 · Jul 14: 1 · Jul 28: 2
Aug 11: 4 · Aug 18: 6 · Aug 25: 2
Sep 1: 3 · Sep 15: 8 · Sep 22: 3 · Sep 29: 6
Oct 6: 2 · Oct 13: 2 · Oct 20: 3 · Oct 27: 6
Nov 3: 9 · Nov 10: 8 · Nov 17: 8 · Nov 24: 4
Dec 1: 11 · Dec 8: 9 · Dec 15: 14 · Dec 22: 10 · Dec 29: 8
Jan 5: 107 · Jan 12: 22 · Jan 19: 45 · Jan 26: 32
Feb 2: 59 · Feb 9: 64 · Feb 16: 63 · Feb 23: 83
Mar 2: 83 · Mar 9: 88 · Mar 16: 97 · Mar 23: 72 · Mar 30: 215
Apr 6: 158 · Apr 13: 250 · Apr 20: 199 · Apr 27: 332
May 4: 304 · May 11: 189 · May 18: 131 · May 25: 102
Jun 1: 129 · Jun 8: 122 · Jun 15: 49 · Jun 22: 40
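
The 4-week momentum figure can be reproduced from the weekly counts. The sketch below assumes momentum is simply the sum of the latest four weekly counts minus the sum of the four weeks before, with the percentage taken relative to the prior window; the function name `momentum` and the rounding are illustrative assumptions, not the site's documented method.

```python
def momentum(weekly_opens, window=4):
    """Compare the latest `window` weeks of new roles with the `window` weeks before.

    Assumed definition (not confirmed by the page): delta = recent - prior,
    percentage = delta / prior, rounded to the nearest whole percent.
    """
    if len(weekly_opens) < 2 * window:
        raise ValueError("need at least two full windows of weekly counts")
    recent = sum(weekly_opens[-window:])
    prior = sum(weekly_opens[-2 * window:-window])
    delta = recent - prior
    pct = round(100 * delta / prior) if prior else None
    return recent, prior, delta, pct


# Last eight weekly counts from the chart (weeks labeled May 4 through Jun 22).
last_eight_weeks = [304, 189, 131, 102, 129, 122, 49, 40]
recent, prior, delta, pct = momentum(last_eight_weeks)
print(f"{recent} opens last 4w, {prior} prior 4w, change {delta} ({pct}%)")
```

Under that assumption the result agrees with the figures shown in the summary: 340 opens in the last four weeks, 726 in the prior four, a change of -386 (-53%).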

NVIDIA currently has 496 active AI-related job listings. The majority of these roles, 52%, are focused on serving infrastructure, with agents representing another significant segment at 23%. Engineering is the dominant function, with 441 positions. The United States leads hiring geographies with 287 roles, followed by China with 64. Frequent tech tags include model_serving, inference_infra, and agent_orchestration, suggesting a focus on deployment and management of AI models. Over the last 30 days, NVIDIA posted 214 new AI roles, a 27% decrease compared to the previous 30-day period.

Auto-generated from active job postings · last refreshed 2026-05-24

Frequently asked questions

  • What AI roles is NVIDIA hiring for?

    NVIDIA currently has 487 active AI-related roles in our index. The most common open titles are: Deep Learning Performance Architect (4), Senior Deep Learning Performance Architect (4), AI Research Scientist (3), Developer Technology Engineer - AI (3), Manager, Deep Learning Algorithms (3). Most positions are in Engineering and Research.

  • What stage of AI development does NVIDIA focus on?

    NVIDIA's active AI hiring is concentrated in: serving infrastructure (54%), agents (21%), application (8%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.

  • Where is NVIDIA hiring AI talent?

    NVIDIA is hiring AI talent in: United States (286 roles), China (59 roles), Israel (50 roles), Germany (21 roles).

  • What technologies does NVIDIA's AI team work with?

    Job postings at NVIDIA most frequently reference: model serving, inference infra, agent orchestration, llm observability, multimodal.

  • How many AI roles has NVIDIA posted recently?

    In the past 30 days, NVIDIA has posted 110 new AI-related roles. That is a 50% decrease versus the prior 30 days (218 → 110).

Jobs (197)

434 AI · 1824 total active
Filtered: Stage = Serve · Country = United States
Show: Active only · AI only (≥ 7)
Stage: Data 28 · Pretrain 30 · Post-train 51 · Serve 356 · Agent 192 · Eval Gate 11 · Ship 55
Function: Engineering 627 · Research 82 · Product 14
Country: United States 439 · China 93 · Israel 54 · Germany 36 · Switzerland 31 · India 26 · United Kingdom 24 · Poland 17 · Vietnam 13 · Canada 12 · Singapore 11 · France 10 · Netherlands 9 · Italy 8 · Taiwan 6 · Hong Kong 4 · Japan 4 · Spain 3 · Australia 2 · Czech Republic 2 · Finland 2 · Hungary 2 · South Korea 2 · Armenia 1 · Brazil 1 · Mexico 1 · Romania 1 · Saudi Arabia 1 · Sweden 1 · United Arab Emirates 1
Sort: AI score · Recent · Title
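
The filter and sort controls above can be thought of as a predicate plus an ordering over role records. The following is a minimal sketch, assuming a hypothetical `Role` record and `filter_roles` helper; the field names, the reading of "AI only (≥ 7)" as an AI-score threshold, and the title tie-break are illustrative assumptions rather than the site's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    # Hypothetical record shape; a role can carry more than one stage tag.
    title: str
    stages: tuple
    function: str
    country: str
    ai_score: int
    active: bool = True


def filter_roles(roles, stage=None, country=None, min_ai_score=7, active_only=True):
    """Keep roles matching the selected filters, sorted by AI score (highest first)."""
    selected = [
        r for r in roles
        if (not active_only or r.active)
        and r.ai_score >= min_ai_score
        and (stage is None or stage in r.stages)
        and (country is None or r.country == country)
    ]
    # Tie-break by title is an assumption made only to keep the order deterministic.
    return sorted(selected, key=lambda r: (-r.ai_score, r.title))


roles = [
    Role("AI Inference Performance Engineer", ("Serve",), "Engineering", "United States", 9),
    Role("Senior Inference Engineer, AIConfigurator for Dynamo", ("Serve",), "Engineering", "United States", 8),
    Role("Senior Deep Learning Engineer", ("Serve", "Agent"), "Engineering", "United States", 9),
    # Hypothetical row, included only to show a role being filtered out.
    Role("Hypothetical low-score role", ("Ship",), "Product", "Germany", 5),
]

for role in filter_roles(roles, stage="Serve", country="United States"):
    print(role.ai_score, role.title)
```

Treating stages as a tuple lets a single role match several stage filters, which is consistent with rows in the listing that carry two stage tags.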
Title · Stage · Function · Location · First seen · AI score
Senior Systems Software Engineer, AI Stack and Performance - DGX Station
Senior Systems Software Engineer focused on optimizing AI stack performance and readiness on NVIDIA's DGX Station, a workstation-class AI computer. The role involves profiling, identifying bottlenecks, and driving optimizations across the full stack from GPU kernels to applications, ensuring AI workloads like LLM inference and agents run efficiently in multi-GPU, multi-user configurations. Collaboration with framework, compiler, and GPU architecture teams is critical.
Stage: Serve, Ship · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: 3w ago · AI score: 9
Senior Software Engineer, DGX Cloud AI Infrastructure
Senior Software Engineer to lead the bring-up, triage, benchmarking, analysis, and optimization of distributed training and inference workloads across NVIDIA GPU platforms at scale. This role involves setting technical direction for communication libraries, model frameworks, and inference/training stacks, leading performance and reliability investigations, defining benchmarking and qualification processes, and building resilience capabilities for large clusters.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA +4 · Remote · First seen: 3w ago · AI score: 9
Software Engineer, DGX Cloud AI Infrastructure
Software Engineer role focused on AI infrastructure, specifically distributed training and inference workloads on NVIDIA GPU platforms. Responsibilities include bring-up, triage, benchmarking, analysis, and optimization of these workloads at scale. Requires experience with multi-GPU/multi-node systems, debugging distributed environments, and strong Python/C++ skills.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA +4 · Remote · First seen: 3w ago · AI score: 9
Senior Deep Learning Performance Architect
NVIDIA is seeking a Senior Deep Learning Performance Architect to analyze and develop next-generation architectures for AI and HPC applications. The role involves developing innovative architectures, analyzing performance/cost/power trade-offs using models and simulators, understanding hardware/software interplay, and evaluating PPA for architectural decisions. Collaboration with software, product, and research teams is key. Requires MS/PhD, 6+ years experience, strong background in GPU/Deep Learning ASIC architecture for distributed training/inference, performance modeling, and ML/DL fundamentals, particularly transformer architectures. Proficiency in Python, C, C++ is essential.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · First seen: 3w ago · AI score: 9
AI Inference Performance Engineer - New College Grad 2026
NVIDIA is seeking an AI Inference Performance Engineer to optimize and benchmark GenAI inference on their accelerators, working with frameworks like TensorRT-LLM, SGLang, and vLLM. The role involves driving industry benchmark results, defining cutting-edge workloads, architecting distributed inference, establishing performance methodology, and influencing the ecosystem through open-source contributions and cross-functional partnerships. Requires strong programming skills, DL framework expertise, and a deep understanding of LLM inference mechanics.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA · First seen: 3w ago · AI score: 9
Senior Performance Architect, Nemotron
NVIDIA is seeking a Senior Performance Architect for Nemotron to focus on deep model-system-hardware co-design. The role involves developing high-fidelity performance models to evaluate architectural choices, predict deployment efficiency, and ensure Pareto-optimal trade-offs for future Nemotron models. This position will guide future software and hardware roadmaps by modeling end-to-end performance impact of GenAI workflows and collaborating with research, framework, compiler, and hardware teams.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +2 · First seen: 5w ago · AI score: 9
Software Engineer, AI and DL Kernel Libraries - New College Grad 2026
Software Engineer role focused on developing AI systems software for efficient inference, including libraries, code generators, and GPU kernels for NVIDIA's hardware. The role involves designing abstractions, optimizing kernels, building LLM serving runtimes, and contributing to open-source projects like FlashInfer and vLLM.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: 5w ago · AI score: 9
Senior DL Algorithms Engineer - Inference Performance
Senior engineer to optimize LLM/Omni model inference performance on NVIDIA's accelerated inference software stack, working across hardware and software layers. Involves enabling and optimizing open models, contributing code to frameworks like TRT-LLM and vLLM, profiling bottlenecks, and benchmarking.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: 7w ago · AI score: 9
Senior Deep Learning Software Engineer, Inference
Senior Software Engineer specializing in Deep Learning Inference to optimize GPU-accelerated software for AI applications. Focus on high-performance deep learning frameworks like SGLang and vLLM for efficient model serving and inference, improving performance across NVIDIA accelerators.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: 7w ago · AI score: 9
Tech Engagement Lead - Model Builder
This role focuses on engaging with leading AI model builders to drive the adoption and optimize the performance of NVIDIA's hardware, systems, and software (e.g., GPUs, DGX, CUDA-X, NeMo, TensorRT) within their generative AI workflows, specifically for training and inference. The role involves technical integration, strengthening partnerships, influencing product roadmaps, and showcasing best practices for scalable AI model development pipelines.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA · First seen: 8w ago · AI score: 9
Senior Software Engineer, AI Inference Systems
Senior Software Engineer focused on building and optimizing AI inference systems, including vLLM, GPU kernels, and orchestration for large-scale model deployments. The role involves performance engineering, benchmarking (MLPerf), and potentially research integration.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA · First seen: 8w ago · AI score: 9
Senior Deep Learning Software Engineer
Senior Deep Learning Software Engineer to design and build an automated inference and deployment solution with a scalable architecture focusing on ease-of-use and compute efficiency. The role involves developing features in high-level frameworks, implementing a high-performance execution environment, and low-level GPU optimizations.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · First seen: Apr 24 · AI score: 9
Principal Architect, AI Networking
This role leads the research agenda and architectural direction for NVIDIA's AI networking systems, focusing on high-performance communication at scale. It involves original research, hardware-software co-optimization, and integrating networking into AI serving stacks, with a requirement to publish findings and ship production-grade software.
Stage: Serve, Pretrain · Function: Research · Location: Santa Clara, CA +4 · Remote · First seen: Apr 23 · AI score: 9
Manager, Deep Learning – Autonomous Vehicles and Robotics
Manager for a Deep Learning Engineering team focused on delivering production-quality deep learning solutions for autonomous vehicles and robotics on edge hardware. The role involves leading a team, defining technical initiatives, and collaborating with automotive OEMs and robotics partners to optimize solutions on NVIDIA platforms, working at the intersection of model architectures, compiler technology, and embedded deployment.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 22 · AI score: 9
Senior AI Software Engineer, Kernel Libraries
Senior AI Software Engineer focused on developing kernel libraries and inference systems software to accelerate AI workloads, including LLMs and agents, on NVIDIA's hardware. Responsibilities include innovating and optimizing kernels, designing abstractions for serving engines, and building compilers/runtimes.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: Apr 22 · AI score: 9
Senior Software Engineer, AI and DL Kernel Libraries
Develops libraries, code generators, and GPU kernel technologies for NVIDIA's AI inference systems software stack, focusing on accelerating AI inference through efficient kernels, abstractions, and runtimes for LLMs and agents.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +7 · Remote · First seen: Apr 22 · AI score: 9
Senior AI Compiler Engineer, MLIR
NVIDIA is hiring a Senior AI Compiler Engineer to build an MLIR-based AI compiler for their inference engine, focusing on performance, low memory usage, and usability across data center and edge. The role involves developing graph representations, optimizations, defining APIs, and implementing compiler optimizations and kernel generation for neural networks.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +5 · Remote · First seen: Apr 22 · AI score: 9
Senior DL Software Engineer, Model Optimization and Edge Deployment - Autonomous Vehicles
Senior DL Software Engineer focused on optimizing and deploying large multimodal models (LLMs/VLMs) for real-time robotic execution in autonomous vehicles. The role involves advanced model compression, quantization, pruning, distillation, and inference optimization techniques for edge deployment on NVIDIA hardware, integrating with C++ production environments.
Stage: Serve, Agent · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 21 · AI score: 9
Senior Deep Learning Software Engineer, LLM Performance
Senior Deep Learning Software Engineer focused on optimizing LLM inference performance on NVIDIA accelerators using frameworks like TensorRT-LLM, vLLM, and Triton. The role involves implementing and scaling inference, serving, and deployment algorithms, collaborating with various teams, and contributing to NVIDIA/OSS LLM frameworks.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 16 · AI score: 9
Senior Software Engineer - AI Inference
Senior Software Engineer focused on optimizing and contributing to open-source LLM inference serving engines like vLLM and SGLang to run efficiently on NVIDIA GPUs, focusing on high-throughput, low-latency inference at scale.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +3 · Remote · First seen: Apr 14 · AI score: 9
Solutions Architect, LLM Model Builder
Solutions Architect focused on enabling partners to build, benchmark, fine-tune, optimize, and deploy foundation model solutions for customer workloads, with an emphasis on reasoning, multimodal, and production inference.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 7 · AI score: 9
Solutions Architect, LLM Model Builder
Solutions Architect focused on enabling partners to build, benchmark, fine-tune, optimize, and deploy foundation model solutions for customer workloads, with a strong emphasis on production inference and reasoning/multimodal models.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 7 · AI score: 9
Senior Manager, Software Engineering - JAX
Senior Engineering Manager to define and drive NVIDIA's JAX strategy, coordinating multiple teams to ensure JAX delivers peak performance across heterogeneous hardware (GPUs, CPUs, LPUs). The role involves supporting emerging needs across training, post-training, inference, and robotics, bridging new hardware capabilities with AI trends. Key responsibilities include driving engineering contribution strategy, promoting teamwork, building partnerships with open-source projects, designing processes, and leading a high-performing engineering organization.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 6 · AI score: 9
Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026
NVIDIA is seeking a Deep Learning Software Engineer to analyze and improve the performance of their inference ecosystem, focusing on TensorRT and related frameworks. The role involves optimizing inference solutions for various NVIDIA accelerators, developing new model pipelines, and collaborating with cross-functional teams on generative AI, robotics, and vision/speech understanding applications.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: Apr 4 · AI score: 9
Manager, Large Language Model Inference
Manager for Large Language Model Inference at NVIDIA, focusing on developing and optimizing LLM/VLM/VLA inference software for NVIDIA GPUs and hardware platforms. The role involves leading a team in specialized kernel development, runtime optimizations, and frameworks for LLM inference, with a strong emphasis on delivering production-grade, high-performance software.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: Apr 4 · AI score: 9
Senior Deep Learning Software Engineer, TensorRT Performance
NVIDIA is seeking a Senior Deep Learning Software Engineer to analyze and improve the performance of their deep learning inference ecosystem, specifically focusing on TensorRT. The role involves optimizing inference solutions for various NVIDIA accelerators, contributing to inference frameworks, and developing new model pipelines for generative AI and other applications.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: Apr 4 · AI score: 9
Senior Deep Learning Communication Architect
Senior Deep Learning Communication Architect role focused on optimizing communication performance for large-scale distributed deep learning training and inference. This involves identifying bottlenecks, designing efficient protocols, collaborating on hardware/software co-design, and exploring new communication technologies. The role requires deep understanding of parallelism techniques and experience with DNN frameworks and GPU computing.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA +1 · First seen: Apr 4 · AI score: 9
Senior Deep Learning Performance Architect - LPU
NVIDIA is seeking a Senior Deep Learning Performance Architect to focus on hardware-software co-design for AI Inference performance. The role involves designing GPU and system architectures, analyzing deep learning algorithms, building performance models, and collaborating with various teams to guide AI direction.
Stage: Serve · Function: Engineering · Location: CA +1 · Remote · First seen: Apr 4 · AI score: 9
Senior Systems Software Engineer - Deep Learning Solutions
Senior Systems Software Engineer focused on optimizing deep learning inference for autonomous vehicles and robotics on edge devices. Requires deep understanding of model architectures, kernel trace analysis, and evaluation of modern architectures on GPUs/SOCs, with a focus on TensorRT and compiler technology for embedded hardware.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 4 · AI score: 9
AI Inference Performance Engineer
This role focuses on optimizing and benchmarking Generative AI inference performance on NVIDIA's hardware accelerators, specifically working with frameworks like TensorRT-LLM, SGLang, and vLLM. The engineer will drive industry benchmark results by implementing optimizations in quantization, scheduling, memory management, and distributed inference. They will also define and optimize cutting-edge workloads, architect distributed inference systems from single-GPU to rack-scale, establish performance methodology using profiling, and contribute to open-source projects. The role requires strong programming skills (Python/C++), expertise in DL frameworks, and a deep understanding of LLM/VLM architectures and inference mechanics.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 4 · AI score: 9
Senior Deep Learning Engineer
Senior Deep Learning Engineer at NVIDIA focused on optimizing inference for next-generation AI workloads including multi-agent systems and generative multimodal models. The role involves characterizing emerging workloads and developing novel optimization methods across the inference stack, from algorithmic to system level, on NVIDIA hardware. Collaboration with research, framework development, and silicon architecture teams is key.
Stage: Serve, Agent · Function: Engineering · Location: Redmond, WA +1 · First seen: Apr 4 · AI score: 9
Senior Deep Learning Architect, LLM Inference
Senior Deep Learning Architect focused on LLM inference performance optimization, benchmarking, and contributing to deep learning software projects like PyTorch, TRT-LLM, vLLM, and SGLang. Requires strong knowledge of deep learning inference serving, PyTorch, profiling, and GPU microarchitecture.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA · First seen: Apr 4 · AI score: 9
Senior Deep Learning Compiler Engineer - XLA
Senior Deep Learning Compiler Engineer focused on optimizing inference and training performance for JAX and OpenXLA on NVIDIA GPUs. Develops compiler optimization algorithms, graph partitioning, tensor sharding, and code generation using MLIR, LLVM, and Triton.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA +5 · Remote · First seen: Apr 4 · AI score: 9
Principal Software Engineer - AI Inference
Principal Software Engineer focused on advancing open-source LLM serving, specifically contributing to inference engines like vLLM and SGLang, optimizing them for NVIDIA GPUs and systems to achieve high-throughput, low-latency inference at scale. The role requires deep technical expertise in inference runtime architecture, GPU performance engineering, and distributed systems.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: Apr 4 · AI score: 9
Senior DL Algorithms Engineer - Inference Performance
Senior DL Algorithms Engineer focused on optimizing inference performance for language and multimodal models using NVIDIA's inference stack (NIMs, TRT-LLM). Role involves profiling, analysis, and collaboration across hardware/software layers to maximize performance on GPUs.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: Apr 4 · AI score: 9
Senior Research Scientist, AI Accelerator Design and VLSI
Research Scientist focused on AI accelerator hardware design, VLSI, and AI HW/SW co-design, applying machine learning and generative AI to hardware design flows and optimization techniques like quantization.
Stage: Serve · Function: Research · Location: Santa Clara, CA · First seen: Apr 4 · AI score: 9
Research Scientist, AI Accelerator Design and VLSI - New College Grad 2026
Research Scientist role focused on AI Accelerator Design and VLSI, involving AI HW/SW Co-Design, quantization, and applying generative AI to hardware design. Requires a PhD and experience in VLSI, computer architecture, or numerical algorithms for AI. Collaborates on research prototypes and publishes findings.
Stage: Serve · Function: Research · Location: Santa Clara, CA · First seen: Apr 4 · AI score: 9
Senior DGX Cloud AI Infrastructure Software Engineer
NVIDIA is seeking a Senior DGX Cloud AI Infrastructure Software Engineer to develop and optimize infrastructure software and tools for large-scale AI training, post-training, and inference. The role focuses on improving efficiency and resiliency of AI workloads, co-designing APIs, and enhancing AI platforms, requiring strong debugging and distributed systems experience.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA +4 · Remote · First seen: Apr 2 · AI score: 9
Research Scientist, ML Systems - PhD New College Grad 2026
Research Scientist role focused on ML Systems, contributing to hardware, software, and infrastructure for ML systems at various scales. The role involves understanding and developing solutions for efficiency, scaling, and resilience in ML systems, with a focus on co-design of algorithms and systems. Requires a PhD and expertise in areas like OS, distributed systems, inference/training systems, data management, cloud computing, or computer architecture.
Stage: Serve, Post-train · Function: Research · Location: Santa Clara, CA +3 · First seen: Jan 9 · AI score: 9
Senior GPU Architect, Deep Learning
NVIDIA is seeking a Senior GPU Architect to design and enhance GPU architecture features specifically for deep learning workloads, covering both training and inference. The role involves developing simulators, mapping deep learning algorithms to hardware, and advancing parallel computation. Requires strong C++, Perl, and Python programming, and a background in computer architecture and high-performance computing.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +2 · First seen: Jan 9 · AI score: 9
Senior Deep Learning Computer Architect
NVIDIA is seeking a Senior Deep Learning Computer Architect to design hardware accelerator and processor architectures for next-generation platforms, enabling state-of-the-art machine learning and data analytics algorithms. The role involves analyzing deep learning methods, proposing new features for acceleration, and studying their benefits, with a focus on LLM workloads and core deep learning kernels.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · First seen: Jan 9 · AI score: 9
Senior Deep Learning Performance Architect
Senior Deep Learning Performance Architect role at NVIDIA focused on developing and analyzing next-generation architectures for AI and HPC applications. This involves performance modeling, simulation, and understanding the interplay of hardware and software for deep learning training and inference.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA +1 · First seen: Jan 9 · AI score: 9
Senior Inference Engineer, AIConfigurator for Dynamo
Senior Inference Engineer role focused on optimizing LLM inference deployment configurations using AIConfigurator, integrating GPU systems, model serving, and performance modeling for NVIDIA platforms.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · Remote · First seen: 2w ago · AI score: 8
Systems Performance Engineer, Agentic AI Workloads – New College Grad 2026
This role focuses on modeling, simulating, and analyzing the system-level performance of agentic AI workloads in datacenter environments. The engineer will develop simulators, characterize LLM serving traffic, identify performance bottlenecks, and provide architectural recommendations for next-generation AI systems. The role requires strong programming skills in C++ and Python, a solid understanding of queueing theory, traffic modeling, and statistics, as well as fundamentals of deep learning and LLM inference serving.
Stage: Serve, Agent · Function: Engineering · Location: Santa Clara, CA +2 · First seen: 3w ago · AI score: 8
Deep Learning Computer Architect - New College Grad 2026
NVIDIA is seeking a Deep Learning Computer Architect to design hardware accelerator and processor architectures for next-generation platforms, enabling state-of-the-art machine learning and data analytics. The role involves analyzing DL methods, proposing new features for acceleration, and studying their benefits, with a focus on LLM workloads and deep learning kernels.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +1 · First seen: 3w ago · AI score: 8
Senior Manager, Artificial Intelligence - Machine Learning Platform
Senior Manager for AI/ML Platform at NVIDIA, leading the development and management of tools and services for the entire AI/ML project lifecycle, focusing on large-scale model training and deployment efficiency. Requires extensive experience in AI/ML infrastructure, team leadership, and strategic vision for AI platforms.
Stage: Serve, Post-train · Function: Engineering · Location: Santa Clara, CA +2 · Remote · First seen: 4w ago · AI score: 8
Engineering Manager, Inference Benchmarking — AI Perf
Engineering Manager for NVIDIA's AIPerf platform, a standard for assessing LLM serving performance. The role involves leading a team to build and advance the platform, focusing on core infrastructure, accuracy of benchmark results, and advising on upstream engine integrations for various AI workloads (LLM, multimodal, diffusion, computer vision). Requires strong systems engineering, inference infrastructure, and open-source community experience.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +5 · Remote · First seen: 4w ago · AI score: 8
GPU Performance Engineer - Neural Reconstruction
GPU Performance Engineer focused on optimizing neural reconstruction and Gaussian Splatting workloads. This role involves profiling, identifying bottlenecks, and improving performance in CUDA, PyTorch, and C++ for training and rendering, while ensuring reconstruction quality is maintained. It requires strong programming, GPU optimization, and performance analysis skills, with collaboration across research and engineering teams.
Stage: Serve, Data · Function: Engineering · Location: CA +5 · Remote · First seen: 4w ago · AI score: 8
AI Software Engineer, Kernel Libraries - New College Grad 2026
AI Software Engineer focused on developing inference systems software stack, including libraries, code generators, and GPU kernels for NVIDIA's hardware. The role involves innovating for efficient AI inference, optimizing kernels, designing abstractions for LLM serving engines, and building JIT compilers and runtimes. Collaboration with internal teams and contributions to open-source projects like FlashInfer, vLLM, and SGLang are expected.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA · First seen: 5w ago · AI score: 8
Senior AI Infrastructure Software Engineer - DGX Cloud
NVIDIA is seeking a Senior AI Infrastructure Software Engineer to design, build, and maintain AI platforms for large-scale AI training, inferencing, fine-tuning, and Agentic AI in production. The role involves developing platform and tools for AI/ML workload efficiency, resiliency, and observability, with a focus on distributed systems and Kubernetes.
Stage: Serve · Function: Engineering · Location: Santa Clara, CA +3 · Remote · First seen: 6w ago · AI score: 8