Currently tracking 440 active AI roles, down 53% versus the prior 4 weeks. Primary focus: Serve · Engineering. Salary range $100k–$575k (avg $262k).
NVIDIA currently has 496 active AI-related job listings. The majority of these roles, 52%, are focused on serving infrastructure, with agents representing another significant segment at 23%. Engineering is the dominant function, with 441 positions. The United States leads hiring geographies with 287 roles, followed by China with 64. Frequent tech tags include model_serving, inference_infra, and agent_orchestration, suggesting a focus on deployment and management of AI models. Over the last 30 days, NVIDIA posted 214 new AI roles, a 27% decrease compared to the previous 30-day period.
NVIDIA currently has 487 active AI-related roles in our index. The most common open titles are: Deep Learning Performance Architect (4), Senior Deep Learning Performance Architect (4), AI Research Scientist (3), Developer Technology Engineer - AI (3), Manager, Deep Learning Algorithms (3). Most positions are in Engineering and Research.
NVIDIA's active AI hiring is concentrated in: serving infrastructure (54%), agents (21%), application (8%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
NVIDIA is hiring AI talent in: United States (286 roles), China (59 roles), Israel (50 roles), Germany (21 roles).
Job postings at NVIDIA most frequently reference: model serving, inference infra, agent orchestration, llm observability, multimodal.
In the past 30 days, NVIDIA has posted 110 new AI-related roles, a roughly 50% decrease from the prior 30 days (218 → 110).
| Title | Stage | AI score |
|---|---|---|
| Deep Learning Performance Architect NVIDIA is seeking a Deep Learning Performance Architect to analyze, model, and optimize deep learning system performance, particularly for LLM workloads, on state-of-the-art hardware architectures. This role influences future hardware and software design by collaborating with various internal teams. | Serve | 9 |
| Deep Learning Performance Architect NVIDIA is seeking a Deep Learning Performance Architect to optimize deep learning hardware and software architectures for edge devices, workstations, and data center GPUs. The role involves benchmarking, performance modeling, bottleneck identification, and exploring new hardware/software capabilities, with a focus on LLMs and generative AI. Experience with AI agents for engineering workflows is also mentioned. | Serve, Post-train | 9 |
| Deep Learning Performance Software Engineer Develops GPU-accelerated deep learning software, including compilers, DSLs, and optimized kernels, for current and next-generation chips, focusing on performance analysis of AI workloads and integration with AI frameworks. | Serve | 9 |
| AI Computing Architect NVIDIA is seeking an AI Computing Architect to develop innovative architectures for deep learning performance and efficiency, analyze trade-offs using models and simulators, and prototype algorithms. The role requires strong programming skills, computer architecture background, and a foundation in machine learning. | Serve, Post-train | 9 |
| Deep Learning Solution Architect NVIDIA is seeking a Deep Learning Solution Architect to design and optimize production-grade generative AI solutions for enterprise customers, focusing on LLM training, RAG, and agentic inference using NVIDIA's ecosystem. | Serve, Agent | 9 |
| Deep Learning Performance Software Engineer Develops GPU-accelerated deep learning software, including compilers, DSLs, and optimized kernels, for current and next-generation chips, focusing on performance analysis of AI workloads and integration with AI frameworks. | Serve | 9 |
| AI Computing Software Development Engineer, LLM Inference Software Development Engineer focused on LLM inference software (TensorRT LLM and TensorRT Edge LLM) at NVIDIA, involving crafting, scaling, performance analysis, optimization, and tuning of inferencing software for GPUs. The role requires strong C/C++ skills, experience with deep learning frameworks, and collaboration across teams. | Serve | 8 |
| AI Computing Software Development Engineer, TensorRT NVIDIA is seeking an AI Computing Software Development Engineer for its TensorRT team to craft and develop robust, scalable inferencing software for GPUs. The role involves performance analysis, optimization, tuning, and collaborating with various teams to guide the direction of machine learning inferencing. Requires a Masters or higher degree, 2+ years of software development experience, strong C/C++ skills, and familiarity with deep learning frameworks. | Serve | 8 |
| AI Computing Development Engineer, TensorRT and TensorRT-LLM AIGV NVIDIA is seeking software engineers to develop and optimize inferencing software (TensorRT/TensorRT-LLM) for AI computing. The role involves performance analysis, tuning, integrating AI advancements, and collaborating across teams to shape machine learning inferencing on NVIDIA platforms. Requires strong programming skills, experience with deep learning frameworks, and a proactive approach. | Serve | 8 |
| Developer Technology Engineer - AI NVIDIA is seeking an AI Developer Technology Engineer to study and develop cutting-edge deep learning techniques, analyze and optimize performance on GPU architectures, and work with customers to provide AI solutions using GPUs. The role involves close collaboration with internal NVIDIA teams to influence future architectures and software platforms. | Serve | 8 |
| AI Computing Development Engineer, TensorRT and TensorRT-LLM NVIDIA is seeking software engineers to develop and optimize AI inference software (TensorRT/TensorRT-LLM) for GPUs. The role involves performance analysis, tuning, integrating new advancements, and collaborating across teams to shape the future of machine learning inferencing. | Serve | 8 |
| Deep Learning Performance Architect NVIDIA is seeking a Deep Learning Performance Architect to develop and optimize GPU-accelerated deep learning inference software, focusing on highly optimized kernels, performance analysis, and tuning. The role involves collaboration across various domains like automotive, image, and speech understanding, and requires strong C/C++ skills and GPU programming experience. | Serve | 8 |
| Senior DGX Cloud AI Infrastructure Software Engineer NVIDIA is seeking a Senior DGX Cloud AI Infrastructure Software Engineer to design, build, and maintain AI infrastructure for large-scale AI training and inferencing. The role involves optimizing efficiency and resiliency of AI workloads, developing scalable AI and Data infrastructure tools, and ensuring high availability of AI systems. | Serve, Data | 8 |
| AI Computing Software Development Engineer, TensorRT NVIDIA is seeking an AI Computing Software Development Engineer for its TensorRT team to craft and develop robust inferencing software for GPUs, focusing on performance analysis, optimization, and tuning. The role involves collaborating with various teams to guide machine learning inferencing direction and potentially publishing key results. | Serve | 8 |
| AI Computing Development Engineer, TensorRT-LLM NVIDIA is seeking software engineers to develop and optimize inferencing software for AI models, specifically focusing on TensorRT-LLM. This role involves performance analysis, tuning, and collaboration across teams to advance machine learning inferencing capabilities. | Serve | 8 |
| Developer Technology Engineer - AI NVIDIA Developer Technology Engineer focused on optimizing AI workloads, particularly large language models (LLMs), on NVIDIA's GPU platform. The role involves deep dives into application performance, GPU kernel optimization, distributed training and inference, and collaboration with various internal teams and external developers. It requires strong software engineering skills, parallel programming expertise, and a focus on performance analysis and tuning. | Serve, Post-train | 8 |
| Senior Solutions Architect - KV Cache and AI Storage Senior Solutions Architect focused on building LLM inference platforms using NVIDIA GPUs, KV cache, and tiered memory solutions. The role involves technical exploration with customers, performance analysis, and translating customer needs into product roadmaps. | Serve | 8 |
| Solutions Architect - Top AI Labs Solutions Architect role at NVIDIA focusing on optimizing LLM inference and training acceleration, contributing to open-source frameworks like SGLang and vLLM, and developing KV cache offloading. Requires strong programming, systems fundamentals, and experience in performance analysis. | Serve, Pretrain | 8 |
| Solutions Architect, Generative AI - CSP NVIDIA is seeking an AI-focused Solutions Architect with expertise in LLMs, generative AI, agentic AI, or recommender systems. The role involves providing technical expertise to customers, assisting with GPU infrastructure for AI, optimizing training and inference pipelines, and gathering customer feedback for product development. This position requires 3+ years of experience in AI for large models and proficiency with AI tools. | Serve, Post-train | 8 |
| Senior Deep Learning Solution Architect Senior Deep Learning Solution Architect at NVIDIA, focusing on LLM inference and training acceleration, performance optimization, and contributing to open-source frameworks like SGLang and vLLM. The role involves developing and optimizing inference frameworks, KV cache offloading, and exploring distributed training performance. | Serve, Post-train | 8 |
| Solutions Architect - CPU and LPU NVIDIA Solutions Architect focused on optimizing AI inference workloads across CPU, GPU, and LPU platforms for customers. The role involves technical expertise, proof-of-concept development, and optimizing AI efficiency in heterogeneous environments. | Serve, Agent | 8 |
| NIM Solutions Architect This role focuses on deploying and optimizing large models using NVIDIA's Inference Microservice (NIM) and related tools. The Solutions Architect will package optimized models (LLM, VLM, etc.) into containers for deployment, refine NIM tools for the community, and design/implement agentic AI solutions for customer scenarios. The role requires strong programming skills, experience with inference engines, and MLOps practices, with a focus on performance engineering and model optimization. | Serve, Agent | 8 |
| Solution Architecture Intern, AI in Industry - 2026 NVIDIA is seeking an AI in Industry Solution Architecture Intern to help optimize large models, develop AI workflows, and deliver advanced AI solutions. The intern will provide technical support, design and implement optimizations for AI models, and set up model training or inference to identify and resolve bottlenecks. This role involves working with various AI models and inference frameworks, conducting research, and collaborating with global teams. | Serve, Post-train | 8 |
| Performance Engineer Intern, Deep Learning and HPC - 2026 NVIDIA is seeking a Performance Engineer Intern to support performance testing of datacenter products and applications, focusing on AI workloads like LLM training and inference, as well as HPC. The role involves benchmarking, profiling, analyzing performance, developing automation scripts, and collaborating with internal teams. The intern will aggregate and report testing data for sales, marketing, and engineering teams, and assist in developing tools and processes for automated testing. | Serve, Post-train | 8 |
| System Software Architect, AI and GPU Networking NVIDIA is seeking a System Software Architect to research and develop advanced networking solutions for AI data centers, focusing on accelerating AI workloads, inference, and model serving. The role involves enhancing GPU networking offerings, designing optimizations for data movement, and evaluating new technologies. | Serve, Post-train | 8 |
| Deeplearning Software Engineer -- Neural 3D reconstruction Software Engineer role focused on deep learning for neural 3D reconstruction, involving research, design, implementation, optimization, and deployment of DNN models. The role requires C++, PyTorch, and ML/DL techniques, with a preference for experience in DNN development and network acceleration. | Serve, Post-train | 8 |
| Senior Manager, Deep Learning Performance Architecture NVIDIA is seeking an Engineering Manager to lead a Deep Learning Performance Architect Team. This role involves managing a team focused on analyzing deep learning networks and advancing deep learning computing systems through hardware/software co-design. Responsibilities include establishing team objectives, collaborating with software framework and hardware architecture teams, characterizing deep learning workloads, performance tuning, optimizing software stacks, and driving the evolution of next-generation hardware and software architectures. | Serve | 8 |
| Deep Learning Performance Architect NVIDIA is seeking Software Engineers to join their Deep Learning Inference team, focusing on developing and optimizing GPU-accelerated deep learning kernels for inference. The role involves performance analysis, tuning, and collaboration with cross-functional teams on innovative solutions. | Serve | 8 |
| Senior System Software Architect, HPC and AI Networking NVIDIA is seeking a Senior System Software Architect to design and prototype scalable software systems for distributed AI training and inference, focusing on optimizing throughput, latency, and memory efficiency. The role involves developing and evaluating communication libraries, collaborating with AI framework teams, co-designing hardware features for AI acceleration, and contributing to runtime systems and protocol layers. | Serve, Post-train | 8 |
| Software Engineer, LLM Inference Software Engineer focused on developing and optimizing LLM inference software and frameworks, working with GPU-accelerated libraries and deep learning frameworks. | Serve | 8 |
| Compute Architecture Software Engineer NVIDIA is seeking an LLM Inference Software Engineer to accelerate LLM inference using GPU technology on the TRTLLM project. The role involves developing and optimizing software solutions, implementing GPU-based algorithms, and improving performance across diverse computing environments. | Serve | 8 |
| Software Engineer, cuDNN - Deep Learning Software Engineer role focused on developing and optimizing cuDNN, a GPU-accelerated library for deep neural networks, including LLM support. The role involves performance analysis, tuning, and collaboration with cross-functional teams to innovate across various AI applications. | Serve | 8 |
| Deep Learning Performance Architect, CUTLASS DSL NVIDIA is seeking an engineer to develop and optimize CUTLASS DSL, a Python-native language for GPU kernel development, and its associated MLIR dialects and lowering passes. The role involves accelerating kernel compilation for NVIDIA's next-generation AI platforms, aiming for performance comparable to CUTLASS C++. | Serve | 7 |
| Deep Learning Performance Architect NVIDIA is seeking a Deep Learning Performance Architect to optimize deep learning hardware and software architecture, analyze performance of deep learning algorithms on different architectures, identify bottlenecks, and explore new features and hardware capabilities. Requires a strong background in computer architecture and experience with deep learning platforms and frameworks. | Serve | 7 |
| Deep Learning Compiler Engineer - CUDA NVIDIA is seeking a Deep Learning Compiler Engineer to design and implement DSLs and compiler cores for emerging GPU architectures, focusing on optimizing performance for AI/LLM workloads and integrating with AI/ML frameworks. | Serve | 7 |
| Developer Technology Engineer, AI NVIDIA Developer Technology Engineer focused on optimizing AI and deep learning applications on GPU architectures, working with customers to provide AI solutions, and collaborating with internal teams to influence future hardware and software design. | Serve | 7 |
| Senior System Software Engineer - AI Performance and Efficiency Tools NVIDIA is seeking a Senior System Software Engineer to develop tools for AI researchers and SW/HW teams running AI workloads on GPU clusters. The role involves building internal profiling, analysis, debugging, benchmarking, and simulation tools to improve the performance and efficiency of AI workloads and systems. This includes partnering with HW architects and understanding deep learning frameworks, distributed training/inference, and GPU cluster technologies. | Serve, Data | 7 |
| Senior Developer Relations Manager NVIDIA is seeking a Senior Developer Relations Manager to engage with the China industrial and research community, focusing on integrating GPU-accelerated computing solutions, particularly in Generative AI, Agentic AI, and AI Storage. The role involves understanding community requirements, promoting NVIDIA tools, architecting solutions, and driving adoption of new products within the AI storage ecosystem. | Serve, Agent | 7 |
| Developer Technology Engineer – AI NVIDIA Developer Technology Engineer focused on optimizing deep learning and machine learning workloads on NVIDIA's accelerated computing platform (GPU, CPU, DPU) for key customers. Requires strong C/C++ and CUDA experience, with an MS/PhD in CS or related field. | Serve | 7 |
| Senior Computer Vision and Deep Learning Hardware Architect NVIDIA is seeking an Autonomous Vehicle Performance Architecture Engineer to design, model, and verify state-of-the-art programmable vision accelerators (PVA) for automotive and robotics. The role involves optimizing software for autonomous driving solutions, analyzing and prototyping applications, building performance models for future architectures, and collaborating with teams to enhance PVA architecture. Requires a Masters/PhD, 3+ years of relevant experience, strong C/C++ and computer architecture skills, and performance modeling/optimization expertise. Experience in DSP programming, autonomous vehicle software, deep learning, computer vision, and self-driving cars is a plus. | Serve, Post-train | 7 |
| Senior Software Engineer, NCCL Senior Software Engineer role focused on designing, implementing, and maintaining highly-optimized communication runtimes for Deep Learning frameworks and HPC programming interfaces on GPU clusters. This involves system software development, parallel programming interface contributions, and proof-of-concept creation for new designs and hardware features. | Serve | 7 |
| Solution Architect – Accelerated Computing Libraries NVIDIA is seeking a Solution Architect to drive the adoption of their AI and accelerated computing libraries across industries. The role involves understanding customer workloads, designing solutions using NVIDIA libraries for LLM inference and training acceleration, and collaborating with product teams to improve features and performance. The candidate will also build technical assets and analyze industry trends. | Serve | 7 |
| Senior Deep Learning Test Development Engineer, SDET Senior Deep Learning Test Development Engineer (SDET) at NVIDIA's AI SWQA team, responsible for validating the robustness and performance of NVIDIA's AI software and GPU Infrastructure across various AI scenarios. The role involves test planning, design, execution, automation, and bug management, with a focus on improving workflow processes and efficiency. Experience with LLM inference frameworks and AI development tools is required. | Serve | 7 |
| Senior Software Test Development Engineer - Deep Learning NVIDIA is seeking a Senior Software Test Development Engineer for its AI SWQA team. This role involves defining, developing, and executing tests to validate the robustness and performance of NVIDIA's AI software and GPU infrastructure across various AI applications like autonomous driving, healthcare, and NLP. The engineer will collaborate with AI product teams, develop complex test plans, manage bug lifecycles, and automate test cases for CI/CD pipelines. The position requires a Master's degree, 5+ years of QA/test automation experience, strong Python skills, and direct experience with AI tools/products or using AI for major features. Experience with AI for QA automation and deep learning frameworks is a plus. | Serve | 7 |
| Senior Solutions Architect, GPU System NVIDIA is seeking a Senior Solutions Architect with expertise in GPU server platforms and AI infrastructure to help customers design, deploy, and optimize NVIDIA-based AI factories. The role involves leading presales and architecture engagements, designing end-to-end AI data center solutions, and supporting the deployment of NVIDIA platforms for LLM training and inference workloads. | Serve, Agent | 7 |
| Solution Architect - Top AI Labs Solution Architect role focused on designing AI computing platform architectures and supporting top AI Labs and model builders in integrating NVIDIA technologies for Deep Learning, HPC, Robotics, and Signal Processing applications. Requires experience with ML, data analytics, computer vision, and parallel programming on cloud/HPC architectures. | Serve | 7 |
| Devtech Compute Engineer NVIDIA is seeking a Devtech Compute Engineer to develop performance-critical code for deep learning applications, focusing on accelerating model training and inference on GPUs, particularly for recommender systems. The role involves optimizing CUDA kernels, integrating solutions into open-source libraries, and collaborating with hardware teams to define future solutions across various domains like LLM, Recsys, Robotics, and Assisted Driving. | Serve, Data | 7 |
| Senior System Software Architect, AI and GPU Networking This role focuses on architecting and enhancing NVIDIA's GPU Networking offerings to accelerate AI workloads, including distributed AI, deep learning, inference, and model serving. It involves co-designing hardware features and leading the architecture and design of new technologies for AI data centers. | Serve, Post-train | 7 |
| Senior Developer Technology Engineer This role focuses on optimizing GPU-accelerated code for training and inference performance of large-scale recommender systems. It involves designing and implementing high-performance C++/CUDA components, developing tests, and optimizing data flows between GPUs, NICs, and SSDs. The ideal candidate has experience with C++, CUDA, Python, GPU performance profiling, and ideally, building or optimizing recommender systems or production ML workloads on GPUs. | Serve, Ship | 7 |
| HPC and AI Cluster Engineer NVIDIA is seeking an HPC and AI Cluster Engineer to manage and maintain large-scale HPC/AI clusters, including Linux job scheduling, CI/CD pipelines, and troubleshooting from bare metal to application level. The role involves supporting R&D activities and POCs, working with cutting-edge hardware and software, and collaborating with researchers and customers to develop solutions. | Serve | 7 |