Currently tracking 440 active AI roles, down 53% versus the prior 4 weeks. Primary focus: Serve · Engineering. Salary range $100k–$575k (avg $262k).
NVIDIA currently has 496 active AI-related job listings. The majority of these roles, 52%, are focused on serving infrastructure, with agents representing another significant segment at 23%. Engineering is the dominant function, with 441 positions. The United States leads hiring geographies with 287 roles, followed by China with 64. Frequent tech tags include model_serving, inference_infra, and agent_orchestration, suggesting a focus on deployment and management of AI models. Over the last 30 days, NVIDIA posted 214 new AI roles, a 27% decrease compared to the previous 30-day period.
NVIDIA currently has 487 active AI-related roles in our index. The most common open titles are: Deep Learning Performance Architect (4), Senior Deep Learning Performance Architect (4), AI Research Scientist (3), Developer Technology Engineer - AI (3), Manager, Deep Learning Algorithms (3). Most positions are in Engineering and Research.
NVIDIA's active AI hiring is concentrated in: serving infrastructure (54%), agents (21%), application (8%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
NVIDIA is hiring AI talent in: United States (286 roles), China (59 roles), Israel (50 roles), Germany (21 roles).
Job postings at NVIDIA most frequently reference: model serving, inference infra, agent orchestration, llm observability, multimodal.
In the past 30 days, NVIDIA has posted 110 new AI-related roles. That is a -50% change versus the prior 30 days (218 → 110).
| Title | Stage | AI score |
|---|---|---|
| AI Computing Development Engineer, TensorRT and TensorRT-LLM AIGV NVIDIA is seeking software engineers to develop and optimize inferencing software (TensorRT/TensorRT-LLM) for AI computing. The role involves performance analysis, tuning, integrating AI advancements, and collaborating across teams to shape machine learning inferencing on NVIDIA platforms. Requires strong programming skills, experience with deep learning frameworks, and a proactive approach. | Serve | 8 |
| DL System Software Engineer - AI Platform NVIDIA is seeking a DL System Software Engineer to join their AI Platform team. The role involves developing and building solutions for scheduling large-scale AI training and inference workloads on GPU clusters, optimizing performance and efficiency for large models. The engineer will work on core infrastructure, resource management, and GPU scheduling, contributing to NVIDIA's AI platform. | Serve |
| 8 |
| Software Engineer, AI Networking Architect NVIDIA is seeking an AI Networking Architect to optimize AI workload performance by analyzing AI models, distributed training, and inference workloads, and translating research insights into software, hardware, and networking architecture requirements. The role involves building platforms and simulations to evaluate trade-offs and influence future NVIDIA product roadmaps. | ServeAgent | 8 |
| GPU Performance Engineer - Neural Reconstruction GPU Performance Engineer focused on optimizing neural reconstruction and Gaussian Splatting workloads, involving PyTorch, CUDA, and GPU profiling to improve training and rendering performance. | ServePost-train | 8 |
| Developer Technology Engineer - AI NVIDIA is seeking an AI Developer Technology Engineer to study and develop cutting-edge deep learning techniques, analyze and optimize performance on GPU architectures, and work with customers to provide AI solutions using GPUs. The role involves close collaboration with internal NVIDIA teams to influence future architectures and software platforms. | Serve | 8 |
| Systems Performance Engineer, Agentic AI Workloads – New College Grad 2026 This role focuses on modeling, simulating, and analyzing the system-level performance of agentic AI workloads in datacenter environments. The engineer will develop simulators, characterize LLM serving traffic, identify performance bottlenecks, and provide architectural recommendations for next-generation AI systems. The role requires strong programming skills in C++ and Python, a solid understanding of queueing theory, traffic modeling, and statistics, as well as fundamentals of deep learning and LLM inference serving. | ServeAgent | 8 |
| Deep Learning Computer Architect - New College Grad 2026 NVIDIA is seeking a Deep Learning Computer Architect to design hardware accelerator and processor architectures for next-generation platforms, enabling state-of-the-art machine learning and data analytics. The role involves analyzing DL methods, proposing new features for acceleration, and studying their benefits, with a focus on LLM workloads and deep learning kernels. | Serve | 8 |
| Manager, Deep Learning Algorithms Manager to lead engineering activities for productizing Deep Learning models, focusing on implementing and optimizing state-of-the-art algorithms for GPU-accelerated platforms. The role involves leading a team, collaborating with internal partners on roadmap development, and deploying training and inference workloads. | ServeData | 8 |
| Engineering Manager, Inference Benchmarking — AI Perf Engineering Manager for NVIDIA's AIPerf platform, a standard for assessing LLM serving performance. The role involves leading a team to build and advance the platform, focusing on core infrastructure, accuracy of benchmark results, and advising on upstream engine integrations for various AI workloads (LLM, multimodal, diffusion, computer vision). Requires strong systems engineering, inference infrastructure, and open-source community experience. | Serve | 8 |
| AI Computing Development Engineer, TensorRT and TensorRT-LLM NVIDIA is seeking software engineers to develop and optimize AI inference software (TensorRT/TensorRT-LLM) for GPUs. The role involves performance analysis, tuning, integrating new advancements, and collaborating across teams to shape the future of machine learning inferencing. | Serve | 8 |
| GPU Performance Engineer - Neural Reconstruction GPU Performance Engineer focused on optimizing neural reconstruction and Gaussian Splatting workloads. This role involves profiling, identifying bottlenecks, and improving performance in CUDA, PyTorch, and C++ for training and rendering, while ensuring reconstruction quality is maintained. It requires strong programming, GPU optimization, and performance analysis skills, with collaboration across research and engineering teams. | ServeData | 8 |
| Senior DGX Cloud AI Infrastructure Software Engineer NVIDIA is seeking a Senior DGX Cloud AI Infrastructure Software Engineer to design, build, and maintain AI infrastructure for large-scale AI training and inferencing. The role involves optimizing efficiency and resiliency of AI workloads, developing scalable AI and Data infrastructure tools, and ensuring high availability of AI systems. | ServeData | 8 |
| Senior AI Infrastructure Software Engineer - DGX Cloud NVIDIA is seeking a Senior AI Infrastructure Software Engineer to design, build, and maintain AI platforms for large-scale AI training, inferencing, fine-tuning, and Agentic AI in production. The role involves developing platform and tools for AI/ML workload efficiency, resiliency, and observability, with a focus on distributed systems and Kubernetes. | Serve | 8 |
| Senior Performance Compiler Engineer - Triton Senior Performance Compiler Engineer to work on the open-source Triton compiler project, focusing on using compilers to improve AI performance on NVIDIA GPUs for large language models, agents, and other AI applications. The role involves investigating GPU hardware, designing and implementing compiler technology using MLIR to optimize kernel descriptions for efficient GPU code generation, and collaborating with internal teams. | Serve | 8 |
| Senior GPU System Architect Seeking a Senior GPU System Architect to design multi-GPU scale-up and scale-out systems for AI and HPC datacenters. The role involves defining system architectures that integrate GPU compute, memory, and interconnects for optimal AI performance and scalability. Requires deep experience in system-level fabric/networking architecture and hardware-software co-design. | Serve | 8 |
| Senior Deep Learning Performance Architect Senior Deep Learning Performance Architect at NVIDIA to design and evaluate hardware architectures for AI/HPC applications, focusing on LLM inference and training performance, and optimizing system bottlenecks. | ServePost-train | 8 |
| Senior Data Center Performance Engineer - Benchmarking and Optimization Senior Data Center Performance Engineer at NVIDIA focused on benchmarking and optimizing data center platforms for AI training, inference, and HPC workloads. Responsibilities include designing benchmarks, characterizing workloads, identifying bottlenecks, and driving performance improvements through system tuning and architectural recommendations. | Serve | 8 |
| NCX Engineer, AI Accelerator This role focuses on engineering and deploying AI infrastructure and solutions for strategic customers, optimizing large-scale training and inference workloads on NVIDIA's AI platform. It involves MLOps, Kubernetes, GPU scheduling, and performance tuning, with a strong emphasis on customer-facing technical support and collaboration. | ServePost-train | 8 |
| Machine Learning Applications and Compiler Engineer, LPX - New College Grad 2026 NVIDIA is seeking engineers to develop algorithms and optimizations for their LPX inference and compiler stack, working at the intersection of large-scale systems, compilers, and deep learning to optimize neural network workloads on future NVIDIA platforms. The role involves building and maintaining high-performance runtime and compiler components, defining workload mappings, integrating with the SW ecosystem, benchmarking, profiling, and collaborating with hardware teams. It also includes prototyping new compilation techniques and publishing technical work. | Serve | 8 |
| Senior Deep Learning Framework Communications Engineer Senior Deep Learning Framework Communications Engineer at NVIDIA, focusing on integrating and optimizing communication libraries (NCCL, NVSHMEM) within AI frameworks (PyTorch, TRT-LLM, vLLM, JAX) to enhance performance for large-scale AI training and inference. The role involves deep analysis of AI workloads, compiler improvements, and kernel authoring for multi-GPU systems. | Serve | 8 |
| Director, System Software Engineering - Metropolis Accelerated and Inferencing Software NVIDIA is seeking a Director of System Software Engineering to lead teams responsible for the full lifecycle of Vision AI strategy, from model onboarding to production deployment. The role focuses on transforming foundation models into real-time, GPU-accelerated video intelligence systems, scaling multimodal reasoning, and enabling agentic development workflows. Key responsibilities include architecting and operationalizing inference acceleration, driving implementations of frameworks like TensorRT and VLLM, collaborating with partners on custom models, and ensuring performance benchmarking. The ideal candidate has extensive experience in deep learning, GPU optimization, and leading engineering teams in embedded and enterprise platforms. | ServeAgent | 8 |
| Senior Software Architect - Deep Learning and HPC Communications Senior Software Architect role at NVIDIA focused on designing and implementing next-generation data center platforms and scalable communication software for AI and HPC workloads. The role involves investigating performance bottlenecks, developing new communication technologies, exploring hardware/software co-design, and building proofs-of-concept to drive innovation in large-scale GPU clusters. | Serve | 8 |
| Senior Software Engineer, Deep Learning Inference Senior Software Engineer focused on optimizing deep learning inference for LLMs and omnimodal architectures on NVIDIA hardware, including GPU kernel tuning, distributed inference, and contributing to open-source libraries. | Serve | 8 |
| Senior Hardware Architect, Deep Learning GPU and System Senior Hardware Architect role focused on designing next-generation GPUs and systems to advance the state of AI, analyzing deep learning workloads, and proposing new features for acceleration. Requires 8+ years of experience in performance, hardware architecture, and deep learning analysis. | Serve | 8 |
| Senior Software Engineer - VLM Microservices for Neural Reconstruction Senior Software Engineer to design, build, and optimize containerized inference execution for 3D Vision Language Models (VLMs) for neural reconstruction, turning research into production-grade software (NIMs). The role involves developing benchmarks, releasing and maintaining models, contributing to open-source projects like vLLM, and collaborating with research and product teams. Requires experience with AI distributed systems, inference platforms, Python/C++, and software engineering fundamentals. | ServePost-train | 8 |
| Principal AI and ML Infra Software Engineer, GPU Clusters This role focuses on enhancing the efficiency of AI and ML research on GPU clusters by collaborating with researchers to identify and address infrastructure deficiencies. The engineer will optimize performance, monitor resource utilization, and contribute to the AI/ML infrastructure ecosystem, keeping up-to-date with the latest AI/ML technologies. | Serve | 8 |
| Senior Deep Learning Software Engineer - Autonomous Vehicles Senior Deep Learning Software Engineer focused on developing and productizing deep learning solutions for autonomous vehicles. The role involves training, fine-tuning, optimizing perception DNNs, applying quantization, improving DNN architectures, and enhancing inference speed and power consumption. It requires strong programming skills, experience with deep learning frameworks, computer vision tasks, and familiarity with CNNs and Transformer architectures. Experience with low precision inference, quantization, and NVIDIA software libraries is a plus. | ServePost-train | 8 |
| Compiler Engineer - AI Inference NVIDIA is seeking an AI Compiler Engineer to optimize kernel generation and computational graph optimizations for AI inference and training workloads on next-generation GPUs. The role involves hands-on development, collaboration on hardware/software co-design, and scaling AI deployments in datacenters. | ServePost-train | 8 |
| Senior Software Engineer, Metropolis Vision AI Senior Software Engineer to develop and optimize high-performance Vision AI pipelines and large-scale distributed services for processing video, image, and 3D data. The role involves crafting real-time systems, developing multi-modal perception, using simulation/synthetic data, and profiling/tuning GPU-accelerated inference pipelines. Collaboration with research and platform teams is key, with an emphasis on bringing research into production at scale. | ServePost-train | 8 |
| Senior Software Engineer, AI Networking Senior Software Engineer role focused on building and productizing ML tools for optimizing AI workloads (LLM training/inference) across GPU/CPU clusters, with a focus on networking and system resource utilization. Involves distributed deep learning, ML-based optimization techniques, and performance analysis. | ServeAgent | 8 |
| Senior Performance Engineer - LLM Inference Frameworks NVIDIA is seeking a Senior Performance Engineer to optimize LLM inference infrastructure on GPUs, focusing on throughput, memory efficiency, and scalability. The role involves designing and implementing high-performance pipelines, profiling, tuning model execution, and innovating techniques like Speculative Decoding and quantization. Experience with deep learning frameworks and performance debugging is required. | Serve | 8 |
| AI Computing Development Engineer, TensorRT-LLM NVIDIA is seeking software engineers to develop and optimize inferencing software for AI models, specifically focusing on TensorRT-LLM. This role involves performance analysis, tuning, and collaboration across teams to advance machine learning inferencing capabilities. | Serve | 8 |
| Senior Software Engineer, JAX Senior Software Engineer to develop NVIDIA's AI platform, focusing on performance optimizations in deep learning frameworks using JAX. The role involves designing and implementing JAX core components, driving performance on NVIDIA products, and building tools to increase efficiency for AI-based systems. | Serve | 8 |
| Senior Architect - Server Performance NVIDIA is seeking architects to drive architectural performance for its next-generation AI server systems. This position demands a unique capability to bridge deep architectural knowledge, workload analysis, and hands-on silicon investigations. Candidates should be adept at working directly with silicon, high-level models, and simulators. Responsibilities include conducting performance investigations on both NVIDIA and competitive platforms, and developing targeted microbenchmarks to examine specific architectural aspects. The role does not heavily involve modeling tasks (functional or performance), though occasional focused assignments may arise. | Serve | 8 |
| Principal Deep Learning Communication Architect NVIDIA is seeking a Principal Deep Learning Communication Architect to lead the technical roadmap for communication libraries across next-generation platforms, ensuring seamless scaling of models to massive clusters. The role involves designing and optimizing communication primitives for heterogeneous interconnects, co-designing with application developers and silicon architects, and developing analytical models for system behavior. Expertise in parallel computing, HPC/distributed deep learning, inference engines, and GPU architecture is required. | ServeAgent | 8 |
| Developer Technology Engineer - AI NVIDIA Developer Technology Engineer focused on optimizing AI workloads, particularly large language models (LLMs), on NVIDIA's GPU platform. The role involves deep dives into application performance, GPU kernel optimization, distributed training and inference, and collaboration with various internal teams and external developers. It requires strong software engineering skills, parallel programming expertise, and a focus on performance analysis and tuning. | ServePost-train | 8 |
| AI and FSI Developer Technology Engineer - New College Grad 2026 NVIDIA is seeking an AI and FSI Developer Technology Engineer to optimize AI and HPC workloads on NVIDIA GPUs and CPUs, focusing on performance tuning and eliminating bottlenecks for financial markets. The role involves research, development, analysis, and collaboration with experts to improve performance across the stack, from algorithms to kernels. The engineer will also publish and present their work and influence future hardware/software designs. | Serve | 8 |
| Deep Learning Algorithms Engineer - ACOT NVIDIA is looking for an AI Acceleration & Optimization Engineer to optimize the performance, scalability, and efficiency of AI models (LLMs, VLMs, diffusion, multimodal) on NVIDIA GPU platforms. The role involves profiling, identifying bottlenecks, and applying optimization techniques like quantization and kernel fusion, using tools such as CUDA, TensorRT, and Nsight. Collaboration with various teams (algorithms, systems, hardware, research, CUDA, compiler, frameworks) is key to bringing models from research to production. | ServePost-train | 8 |
| Senior Machine Learning Applications and Compiler Engineer, LPX NVIDIA is seeking a Senior Machine Learning Applications and Compiler Engineer to develop algorithms and optimizations for their LPX inference and compiler stack, working at the intersection of large-scale systems, compilers, and deep learning to map neural network workloads onto future NVIDIA platforms. | Serve | 8 |
| Senior Machine Learning Applications and Compiler Engineer, LPX Develops algorithms and optimizations for NVIDIA's LPX inference and compiler stack, focusing on mapping neural network workloads onto future NVIDIA platforms and optimizing end-to-end inference performance. Requires strong software engineering, compiler/runtime development, and deep learning framework experience. | Serve | 8 |
| Senior Software Engineer – TensorRT Edge-LLM Senior Software Engineer to develop and optimize a state-of-the-art inference framework for Large Language, Vision-Language, and Multimodal models on edge and embedded platforms, focusing on real-time performance and constrained environments. | Serve | 8 |
| Senior Performance Engineer - Deep Learning Senior Performance Engineer at NVIDIA focused on optimizing Deep Learning models and frameworks (PyTorch, JAX) for NVIDIA GPUs. The role involves building and supporting Transformer Engine, collaborating on systems research for performance improvements, implementing and benchmarking new DL models, contributing to MLPerf, and engaging with the open-source community and enterprise customers. It also involves influencing future hardware and software design. | ServePost-train | 8 |
| Senior Software Engineer, Quantized Inference Senior Software Engineer focused on optimizing quantized inference for LLMs by implementing recipes, developing kernels, and collaborating on inference engines like vLLM and TRT-LLM. The role involves model export pipelines, benchmarking, and data analysis tooling. | Serve | 8 |
| Senior Compiler Engineer, AI Inference Platforms NVIDIA is seeking a Senior Compiler Engineer to join its Deep Learning & AI Compiler (DLC) team. The role involves analyzing deep learning networks, developing compiler optimization algorithms, and collaborating with framework and architecture teams to accelerate AI inference performance on NVIDIA GPUs. The compiler is critical for data centers, personal devices, automotive, and robotics, focusing on inference performance, build time, memory footprints, and ease of use. | Serve | 8 |
| Principal AI Developer Technology Engineer This role focuses on researching and developing techniques to accelerate AI workloads (deep learning, machine learning) on advanced computer architectures, specifically GPUs. The engineer will perform in-depth analysis and optimization of complex AI and HPC algorithms, publish findings, and influence future hardware/software design. Requires deep C/C++ programming, parallel programming (CUDA, etc.), low-level performance optimization, and CPU/GPU architecture expertise. | Serve | 8 |
| Principal AI Developer Technology Engineer Seeking a Principal Developer Technology Engineer to research and develop techniques for GPU acceleration of AI workloads, focusing on performance optimization of deep learning and HPC algorithms on modern CPU and GPU architectures. This role involves collaborating with internal teams and the developer community, influencing hardware/software design, and publishing findings. | Serve | 8 |
| Senior AI Performance and Efficiency Engineer Senior AI/ML Performance and Efficiency Engineer focused on optimizing GPU cluster performance for AI/ML researchers by addressing infrastructure and application bottlenecks. This role involves building tools, analyzing efficiency, and collaborating across teams to improve hardware, software, and infrastructure usage for various ML workloads like Robotics, Autonomous vehicles, LLMs, and Videos. | Serve | 8 |
| Senior AI Developer Technology Engineer Senior Developer Technology Engineer focused on researching and developing techniques to GPU accelerate AI workloads, optimizing performance on modern CPU and GPU architectures, and collaborating with the developer community and internal teams to influence next-generation hardware and software design. | Serve | 8 |
| Engineering Manager, AI Developer Technology Engineering Manager for NVIDIA's AI Developer Technology team, focused on leading a team to optimize and develop algorithms for Deep Learning and Machine Learning applications, influencing next-generation hardware/software, and collaborating with customers and internal teams. The role involves optimizing training and inference performance on NVIDIA hardware. | ServePost-train | 8 |
| Senior Developer Technology Engineer - AI Senior Developer Technology Engineer focused on researching and optimizing AI/ML workloads for GPU acceleration, involving deep analysis, performance tuning, and collaboration with the developer community and internal teams to influence next-generation hardware and software design. | Serve | 8 |