Currently tracking 440 active AI roles, down 53% versus the prior 4 weeks. Primary focus: Serve · Engineering. Salary range $100k–$575k (avg $262k).
NVIDIA currently has 496 active AI-related job listings. The majority of these roles, 52%, are focused on serving infrastructure, with agents representing another significant segment at 23%. Engineering is the dominant function, with 441 positions. The United States leads hiring geographies with 287 roles, followed by China with 64. Frequent tech tags include model_serving, inference_infra, and agent_orchestration, suggesting a focus on deployment and management of AI models. Over the last 30 days, NVIDIA posted 214 new AI roles, a 27% decrease compared to the previous 30-day period.
NVIDIA currently has 487 active AI-related roles in our index. The most common open titles are: Deep Learning Performance Architect (4), Senior Deep Learning Performance Architect (4), AI Research Scientist (3), Developer Technology Engineer - AI (3), Manager, Deep Learning Algorithms (3). Most positions are in Engineering and Research.
NVIDIA's active AI hiring is concentrated in serving infrastructure (54%), agents (21%), and application (8%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
NVIDIA is hiring AI talent in the United States (286 roles), China (59 roles), Israel (50 roles), and Germany (21 roles).
Job postings at NVIDIA most frequently reference model serving, inference infra, agent orchestration, LLM observability, and multimodal.
In the past 30 days, NVIDIA has posted 110 new AI-related roles, a 50% decrease versus the prior 30 days (218 → 110).
| Title | Description | Stage | AI score |
|---|---|---|---|
| Deep Learning Performance Architect | NVIDIA is seeking a Deep Learning Performance Architect to analyze, model, and optimize deep learning system performance, particularly for LLM workloads, on state-of-the-art hardware architectures. This role influences future hardware and software design by collaborating with various internal teams. | Serve | 9 |
| Deep Learning Performance Architect | NVIDIA is seeking a Deep Learning Performance Architect to optimize deep learning hardware and software architectures for edge devices, workstations, and data center GPUs. The role involves benchmarking, performance modeling, bottleneck identification, and exploring new hardware/software capabilities, with a focus on LLMs and generative AI. Experience with AI agents for engineering workflows is also mentioned. | Serve, Post-train | 9 |
| Senior Systems Software Engineer, AI Stack and Performance - DGX Station | Senior Systems Software Engineer focused on optimizing AI stack performance and readiness on NVIDIA's DGX Station, a workstation-class AI computer. The role involves profiling, identifying bottlenecks, and driving optimizations across the full stack from GPU kernels to applications, ensuring AI workloads like LLM inference and agents run efficiently in multi-GPU, multi-user configurations. Collaboration with framework, compiler, and GPU architecture teams is critical. | Serve, Ship | 9 |
| Senior Software Engineer, DGX Cloud AI Infrastructure | Senior Software Engineer to lead the bring-up, triage, benchmarking, analysis, and optimization of distributed training and inference workloads across NVIDIA GPU platforms at scale. This role involves setting technical direction for communication libraries, model frameworks, and inference/training stacks, leading performance and reliability investigations, defining benchmarking and qualification processes, and building resilience capabilities for large clusters. | Serve, Post-train | 9 |
| Senior Deep Learning Performance Architect | NVIDIA is seeking a Senior Deep Learning Performance Architect to analyze and develop next-generation architectures for AI and HPC applications. The role involves developing innovative architectures, analyzing performance/cost/power trade-offs using models and simulators, understanding hardware/software interplay, and evaluating PPA for architectural decisions. Collaboration with software, product, and research teams is key. Requires MS/PhD, 6+ years experience, strong background in GPU/Deep Learning ASIC architecture for distributed training/inference, performance modeling, and ML/DL fundamentals, particularly transformer architectures. Proficiency in Python, C, C++ is essential. | Serve | 9 |
| AI Inference Performance Engineer - New College Grad 2026 | NVIDIA is seeking an AI Inference Performance Engineer to optimize and benchmark GenAI inference on their accelerators, working with frameworks like TensorRT-LLM, SGLang, and vLLM. The role involves driving industry benchmark results, defining cutting-edge workloads, architecting distributed inference, establishing performance methodology, and influencing the ecosystem through open-source contributions and cross-functional partnerships. Requires strong programming skills, DL framework expertise, and a deep understanding of LLM inference mechanics. | Serve | 9 |
| Deep Learning Performance Software Engineer | Develops GPU-accelerated deep learning software, including compilers, DSLs, and optimized kernels, for current and next-generation chips, focusing on performance analysis of AI workloads and integration with AI frameworks. | Serve | 9 |
| AI Computing Architect | NVIDIA is seeking an AI Computing Architect to develop innovative architectures for deep learning performance and efficiency, analyze trade-offs using models and simulators, and prototype algorithms. The role requires strong programming skills, computer architecture background, and a foundation in machine learning. | Serve, Post-train | 9 |
| AI Workload and Networking Research Architect | Research Architect role focused on optimizing AI workloads and networking infrastructure for NVIDIA's AI computing platforms, involving modeling, analysis, and influencing future product roadmaps. | Serve, Post-train | 9 |
| Senior Performance Architect, Nemotron | NVIDIA is seeking a Senior Performance Architect for Nemotron to focus on deep model-system-hardware co-design. The role involves developing high-fidelity performance models to evaluate architectural choices, predict deployment efficiency, and ensure Pareto-optimal trade-offs for future Nemotron models. This position will guide future software and hardware roadmaps by modeling end-to-end performance impact of GenAI workflows and collaborating with research, framework, compiler, and hardware teams. | Serve | 9 |
| Senior DL Algorithms Engineer - Inference Performance | Senior engineer to optimize LLM/Omni model inference performance on NVIDIA's accelerated inference software stack, working across hardware and software layers. Involves enabling and optimizing open models, contributing code to frameworks like TRT-LLM and vLLM, profiling bottlenecks, and benchmarking. | Serve | 9 |
| Machine Learning Applications and Compiler Engineer, LPX - New College Grad 2026 | NVIDIA is seeking engineers to develop algorithms and optimizations for their LPX inference and compiler stack, working at the intersection of large-scale systems, compilers, and deep learning to optimize neural network workloads on future NVIDIA platforms. | Serve | 9 |
| Senior Software Engineer, AI Inference Systems | Senior Software Engineer focused on building and optimizing AI inference systems for large-scale models, involving GPU kernel optimization, inference framework development (vLLM), benchmarking (MLPerf), and orchestration of distributed deployments. | Serve | 9 |
| Senior Software Engineer, AI Inference Systems | Senior Software Engineer focused on building and optimizing AI inference systems, including vLLM, GPU kernels, and orchestration for large-scale model deployments. The role involves performance engineering, benchmarking (MLPerf), and potentially research integration. | Serve | 9 |
| Senior Deep Learning Software Engineer | Senior Deep Learning Software Engineer to design and build an automated inference and deployment solution with a scalable architecture focusing on ease-of-use and compute efficiency. The role involves developing features in high-level frameworks, implementing a high-performance execution environment, and low-level GPU optimizations. | Serve | 9 |
| Principal Architect, AI Networking | This role leads the research agenda and architectural direction for NVIDIA's AI networking systems, focusing on high-performance communication at scale. It involves original research, hardware-software co-optimization, and integrating networking into AI serving stacks, with a requirement to publish findings and ship production-grade software. | Serve, Pretrain | 9 |
| Manager, Deep Learning – Autonomous Vehicles and Robotics | Manager for a Deep Learning Engineering team focused on delivering production-quality deep learning solutions for autonomous vehicles and robotics on edge hardware. The role involves leading a team, defining technical initiatives, and collaborating with automotive OEMs and robotics partners to optimize solutions on NVIDIA platforms, working at the intersection of model architectures, compiler technology, and embedded deployment. | Serve, Post-train | 9 |
| Senior Deep Learning Algorithms Engineer - BioNeMo | Senior Deep Learning Algorithms Engineer at NVIDIA to optimize biology and structural biology models (LLMs, VLMs) for inference performance on GPUs using TensorRT-LLM and related stacks. Focus on low-latency, high-throughput inference, quantization, custom GPU kernels, and production deployment. | Serve, Post-train | 9 |
| Senior AI Software Engineer, Kernel Libraries | Senior AI Software Engineer focused on developing kernel libraries and inference systems software to accelerate AI workloads, including LLMs and agents, on NVIDIA's hardware. Responsibilities include innovating and optimizing kernels, designing abstractions for serving engines, and building compilers/runtimes. | Serve | 9 |
| Senior Software Engineer, AI and DL Kernel Libraries | Develops libraries, code generators, and GPU kernel technologies for NVIDIA's AI inference systems software stack, focusing on accelerating AI inference through efficient kernels, abstractions, and runtimes for LLMs and agents. | Serve | 9 |
| Senior DL Software Engineer, Model Optimization and Edge Deployment - Autonomous Vehicles | Senior DL Software Engineer focused on optimizing and deploying large multimodal models (LLMs/VLMs) for real-time robotic execution in autonomous vehicles. The role involves advanced model compression, quantization, pruning, distillation, and inference optimization techniques for edge deployment on NVIDIA hardware, integrating with C++ production environments. | Serve, Agent | 9 |
| Senior Deep Learning Software Engineer, LLM Performance | Senior Deep Learning Software Engineer focused on optimizing LLM inference performance on NVIDIA accelerators using frameworks like TensorRT-LLM, vLLM, and Triton. The role involves implementing and scaling inference, serving, and deployment algorithms, collaborating with various teams, and contributing to NVIDIA/OSS LLM frameworks. | Serve | 9 |
| Senior Software Engineer - AI Inference | Senior Software Engineer focused on optimizing and contributing to open-source LLM inference serving engines like vLLM and SGLang to run efficiently on NVIDIA GPUs, focusing on high-throughput, low-latency inference at scale. | Serve | 9 |
| Senior HPC and AI Network Software Architect | NVIDIA is seeking a Senior HPC and AI Network Software Architect to design and build scalable AI infrastructure for distributed training and inference. The role involves developing software and hardware approaches to optimize communication efficiency and performance across large-scale systems, collaborating with AI framework teams and hardware teams. | Serve, Post-train | 9 |
| Senior Manager, Software Engineering - JAX | Senior Engineering Manager to define and drive NVIDIA's JAX strategy, coordinating multiple teams to ensure JAX delivers peak performance across heterogeneous hardware (GPUs, CPUs, LPUs). The role involves supporting emerging needs across training, post-training, inference, and robotics, bridging new hardware capabilities with AI trends. Key responsibilities include driving engineering contribution strategy, promoting teamwork, building partnerships with open-source projects, designing processes, and leading a high-performing engineering organization. | Serve, Post-train | 9 |
| Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026 | NVIDIA is seeking a Deep Learning Software Engineer to analyze and improve the performance of their inference ecosystem, focusing on TensorRT and related frameworks. The role involves optimizing inference solutions for various NVIDIA accelerators, developing new model pipelines, and collaborating with cross-functional teams on generative AI, robotics, and vision/speech understanding applications. | Serve | 9 |
| Senior Deep Learning Engineer | Senior Deep Learning Engineer at NVIDIA to optimize and deploy foundation models for physical AI applications (AVs, robots, video analytics) on GPU platforms, focusing on high-performance inference. | Serve, Post-train | 9 |
| Manager, Large Language Model Inference | Manager for Large Language Model Inference at NVIDIA, focusing on developing and optimizing LLM/VLM/VLA inference software for NVIDIA GPUs and hardware platforms. The role involves leading a team in specialized kernel development, runtime optimizations, and frameworks for LLM inference, with a strong emphasis on delivering production-grade, high-performance software. | Serve | 9 |
| Senior Deep Learning Software Engineer, TensorRT Performance | NVIDIA is seeking a Senior Deep Learning Software Engineer to analyze and improve the performance of their deep learning inference ecosystem, specifically focusing on TensorRT. The role involves optimizing inference solutions for various NVIDIA accelerators, contributing to inference frameworks, and developing new model pipelines for generative AI and other applications. | Serve | 9 |
| Senior Software Engineer, AI Inference Systems | NVIDIA is seeking a Senior Software Engineer to build and optimize AI inference systems for large-scale models, focusing on extreme efficiency and performance across multi-GPU, multi-node, and multi-cloud environments. The role involves architecting inference stacks, optimizing GPU kernels and compilers, driving benchmarks (MLPerf), and orchestrating large-scale deployments. | Serve | 9 |
| Senior Deep Learning Communication Architect | Senior Deep Learning Communication Architect role focused on optimizing communication performance for large-scale distributed deep learning training and inference. This involves identifying bottlenecks, designing efficient protocols, collaborating on hardware/software co-design, and exploring new communication technologies. The role requires deep understanding of parallelism techniques and experience with DNN frameworks and GPU computing. | Serve, Post-train | 9 |
| Senior Deep Learning Performance Architect - LPU | NVIDIA is seeking a Senior Deep Learning Performance Architect to focus on hardware-software co-design for AI Inference performance. The role involves designing GPU and system architectures, analyzing deep learning algorithms, building performance models, and collaborating with various teams to guide AI direction. | Serve | 9 |
| Senior Systems Software Engineer - Deep Learning Solutions | Senior Systems Software Engineer focused on optimizing deep learning inference for autonomous vehicles and robotics on edge devices. Requires deep understanding of model architectures, kernel trace analysis, and evaluation of modern architectures on GPUs/SOCs, with a focus on TensorRT and compiler technology for embedded hardware. | Serve, Post-train | 9 |
| AI Inference Performance Engineer | This role focuses on optimizing and benchmarking Generative AI inference performance on NVIDIA's hardware accelerators, specifically working with frameworks like TensorRT-LLM, SGLang, and vLLM. The engineer will drive industry benchmark results by implementing optimizations in quantization, scheduling, memory management, and distributed inference. They will also define and optimize cutting-edge workloads, architect distributed inference systems from single-GPU to rack-scale, establish performance methodology using profiling, and contribute to open-source projects. The role requires strong programming skills (Python/C++), expertise in DL frameworks, and a deep understanding of LLM/VLM architectures and inference mechanics. | Serve | 9 |
| Senior Deep Learning Engineer | Senior Deep Learning Engineer at NVIDIA focused on optimizing inference for next-generation AI workloads including multi-agent systems and generative multimodal models. The role involves characterizing emerging workloads and developing novel optimization methods across the inference stack, from algorithmic to system level, on NVIDIA hardware. Collaboration with research, framework development, and silicon architecture teams is key. | Serve, Agent | 9 |
| Senior Systems Software Engineer - Deep Learning Solutions | Senior Systems Software Engineer focused on deep learning inference optimization for autonomous vehicles and robotics on edge hardware. The role involves analyzing and improving deep learning models on NVIDIA platforms, benchmarking performance, evaluating emerging model architectures, and collaborating with compiler, runtime, and hardware teams to deliver inference solutions. | Serve | 9 |
| Senior Deep Learning Compiler Engineer - XLA | Senior Deep Learning Compiler Engineer focused on optimizing inference and training performance for JAX and OpenXLA on NVIDIA GPUs. Develops compiler optimization algorithms, graph partitioning, tensor sharding, and code generation using MLIR, LLVM, and Triton. | Serve, Post-train | 9 |
| Principal Software Engineer - AI Inference | Principal Software Engineer focused on advancing open-source LLM serving, specifically contributing to inference engines like vLLM and SGLang, optimizing them for NVIDIA GPUs and systems to achieve high-throughput, low-latency inference at scale. The role requires deep technical expertise in inference runtime architecture, GPU performance engineering, and distributed systems. | Serve | 9 |
| Senior DL Algorithms Engineer - Inference Performance | Senior DL Algorithms Engineer focused on optimizing inference performance for language and multimodal models using NVIDIA's inference stack (NIMs, TRT-LLM). Role involves profiling, analysis, and collaboration across hardware/software layers to maximize performance on GPUs. | Serve | 9 |
| Senior Research Scientist, AI Accelerator Design and VLSI | Research Scientist focused on AI accelerator hardware design, VLSI, and AI HW/SW co-design, applying machine learning and generative AI to hardware design flows and optimization techniques like quantization. | Serve | 9 |
| Research Scientist, AI Accelerator Design and VLSI - New College Grad 2026 | Research Scientist role focused on AI Accelerator Design and VLSI, involving AI HW/SW Co-Design, quantization, and applying generative AI to hardware design. Requires a PhD and experience in VLSI, computer architecture, or numerical algorithms for AI. Collaborates on research prototypes and publishes findings. | Serve | 9 |
| Senior DGX Cloud AI Infrastructure Software Engineer | NVIDIA is seeking a Senior DGX Cloud AI Infrastructure Software Engineer to develop and optimize infrastructure software and tools for large-scale AI training, post-training, and inference. The role focuses on improving efficiency and resiliency of AI workloads, co-designing APIs, and enhancing AI platforms, requiring strong debugging and distributed systems experience. | Serve, Post-train | 9 |
| Senior GPU Networking Architect | This role focuses on building and optimizing GPU communication kernels for large-scale AI systems, linking GPU computing with networking. The Senior GPU Networking Architect will leverage deep knowledge of GPU architecture to improve kernel efficiency, minimize latency, and overlap computation with communication. Responsibilities include developing GPU-resident communication primitives, profiling and tuning kernels, and collaborating with various teams to co-design communication strategies. The role requires strong CUDA programming, GPU architecture fundamentals, and systems-level C/C++ development. | Serve | 9 |
| Senior Software Architect, AI Networking | NVIDIA is looking for a Senior Software Architect to design and optimize inference infrastructure for large language models running on GPU clusters. The role involves working across software and hardware domains to define deployment and scaling strategies, optimize latency and throughput, and collaborate with various teams to ensure high-performance solutions. | Serve | 9 |
| Senior Deep Learning Algorithm Engineer | Senior Deep Learning Algorithm Engineer at NVIDIA focused on optimizing deep learning training and inference workloads on state-of-the-art hardware and software platforms. The role involves performance analysis, profiling, and implementation of production-quality software, with a focus on squeezing performance from hardware and software stacks. | Serve, Post-train | 9 |
| Research Scientist, ML Systems - PhD New College Grad 2026 | Research Scientist role focused on ML Systems, contributing to hardware, software, and infrastructure for ML systems at various scales. The role involves understanding and developing solutions for efficiency, scaling, and resilience in ML systems, with a focus on co-design of algorithms and systems. Requires a PhD and expertise in areas like OS, distributed systems, inference/training systems, data management, cloud computing, or computer architecture. | Serve, Post-train | 9 |
| Senior GPU Architect, Deep Learning | NVIDIA is seeking a Senior GPU Architect to design and enhance GPU architecture features specifically for deep learning workloads, covering both training and inference. The role involves developing simulators, mapping deep learning algorithms to hardware, and advancing parallel computation. Requires strong C++, Perl, and Python programming, and a background in computer architecture and high-performance computing. | Serve | 9 |
| Senior Deep Learning Computer Architect | NVIDIA is seeking a Senior Deep Learning Computer Architect to design hardware accelerator and processor architectures for next-generation platforms, enabling state-of-the-art machine learning and data analytics algorithms. The role involves analyzing deep learning methods, proposing new features for acceleration, and studying their benefits, with a focus on LLM workloads and core deep learning kernels. | Serve | 9 |
| Senior Deep Learning Performance Architect | Senior Deep Learning Performance Architect role at NVIDIA focused on developing and analyzing next-generation architectures for AI and HPC applications. This involves performance modeling, simulation, and understanding the interplay of hardware and software for deep learning training and inference. | Serve, Post-train | 9 |
| Senior Deep Learning Software Engineer, Inference | Senior Software Engineer specializing in Deep Learning Inference, focusing on optimizing GPU-accelerated software for large-scale model serving and inference using frameworks like SGLang and vLLM. The role involves performance tuning, implementing latest algorithms, and scaling performance across NVIDIA accelerators. | Serve | 9 |