Currently tracking 440 active AI roles, down 53% versus the prior 4 weeks. Primary focus: Serve · Engineering. Salary range $100k–$575k (avg $262k).
NVIDIA currently has 496 active AI-related job listings. The majority of these roles, 52%, are focused on serving infrastructure, with agents representing another significant segment at 23%. Engineering is the dominant function, with 441 positions. The United States leads hiring geographies with 287 roles, followed by China with 64. Frequent tech tags include model_serving, inference_infra, and agent_orchestration, suggesting a focus on deployment and management of AI models. Over the last 30 days, NVIDIA posted 214 new AI roles, a 27% decrease compared to the previous 30-day period.
NVIDIA currently has 487 active AI-related roles in our index. The most common open titles are: Deep Learning Performance Architect (4), Senior Deep Learning Performance Architect (4), AI Research Scientist (3), Developer Technology Engineer - AI (3), Manager, Deep Learning Algorithms (3). Most positions are in Engineering and Research.
NVIDIA's active AI hiring is concentrated in: serving infrastructure (54%), agents (21%), application (8%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
NVIDIA is hiring AI talent in: United States (286 roles), China (59 roles), Israel (50 roles), Germany (21 roles).
Job postings at NVIDIA most frequently reference: model serving, inference infra, agent orchestration, llm observability, multimodal.
In the past 30 days, NVIDIA has posted 110 new AI-related roles, a decrease of roughly 50% versus the prior 30 days (218 → 110).
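The 30-day trend figure can be checked with a plain percentage-change calculation. The sketch below is only an illustration: `percent_change` is a hypothetical helper name, not something the index exposes, and the two counts are the ones quoted above.

```python
def percent_change(previous: int, current: int) -> float:
    """Return the percentage change from `previous` to `current`."""
    if previous == 0:
        # A change relative to zero is undefined, so fail loudly.
        raise ValueError("previous count must be non-zero")
    return (current - previous) / previous * 100.0


# Counts quoted in the text: 218 roles in the prior window, 110 in the latest.
change = percent_change(218, 110)
print(f"{change:.1f}%")
```

Rounded to a whole number, the result matches the roughly 50% decrease reported for the period.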
| Title and summary | Stage | AI score |
|---|---|---|
| Senior Software Engineer, Deep Learning Inference Senior Software Engineer focused on optimizing deep learning inference for LLMs and omnimodal architectures on NVIDIA hardware, including GPU kernel tuning, distributed inference, and contributing to open-source libraries. | Serve | 8 |
| Senior Hardware Architect, Deep Learning GPU and System Senior Hardware Architect role focused on designing next-generation GPUs and systems to advance the state of AI, analyzing deep learning workloads, and proposing new features for acceleration. Requires 8+ years of experience in performance, hardware architecture, and deep learning analysis. | Serve | 8 |
| Senior Software Engineer - VLM Microservices for Neural Reconstruction | Serve, Post-train | 8 |
| AI Computing Software Development Engineer, TensorRT NVIDIA is seeking an AI Computing Software Development Engineer for its TensorRT team to craft and develop robust inferencing software for GPUs, focusing on performance analysis, optimization, and tuning. The role involves collaborating with various teams to guide machine learning inferencing direction and potentially publishing key results. | Serve | 8 |
| Senior Solutions Architect, Generative AI Senior Solutions Architect role focused on customer engagements for NVIDIA's generative AI technologies, involving AI model training and deployment optimization, particularly for LLMs and recommenders in the consumer internet industry. Requires strong coding, GPU optimization, and communication skills. | Serve, Data | 8 |
| Principal AI and ML Infra Software Engineer, GPU Clusters This role focuses on enhancing the efficiency of AI and ML research on GPU clusters by collaborating with researchers to identify and address infrastructure deficiencies. The engineer will optimize performance, monitor resource utilization, and contribute to the AI/ML infrastructure ecosystem, keeping up-to-date with the latest AI/ML technologies. | Serve | 8 |
| Principal Cloud Services Software Engineer NVIDIA DGX Cloud Team is seeking a Principal Cloud Services Software Engineer to develop and optimize AI infrastructure services for large-scale AI training workflows. The role involves designing and implementing resilient, efficient services orchestrated by Kubernetes, with a focus on backend development, distributed systems, and high-performance computing. | Serve, Agent | 8 |
| Senior Deep Learning Software Engineer - Autonomous Vehicles Senior Deep Learning Software Engineer focused on developing and productizing deep learning solutions for autonomous vehicles. The role involves training, fine-tuning, optimizing perception DNNs, applying quantization, improving DNN architectures, and enhancing inference speed and power consumption. It requires strong programming skills, experience with deep learning frameworks, computer vision tasks, and familiarity with CNNs and Transformer architectures. Experience with low precision inference, quantization, and NVIDIA software libraries is a plus. | Serve, Post-train | 8 |
| Compiler Engineer - AI Inference NVIDIA is seeking an AI Compiler Engineer to optimize kernel generation and computational graph optimizations for AI inference and training workloads on next-generation GPUs. The role involves hands-on development, collaboration on hardware/software co-design, and scaling AI deployments in datacenters. | Serve, Post-train | 8 |
| Senior Software Engineer, Metropolis Vision AI Senior Software Engineer to develop and optimize high-performance Vision AI pipelines and large-scale distributed services for processing video, image, and 3D data. The role involves crafting real-time systems, developing multi-modal perception, using simulation/synthetic data, and profiling/tuning GPU-accelerated inference pipelines. Collaboration with research and platform teams is key, with an emphasis on bringing research into production at scale. | Serve, Post-train | 8 |
| Senior Software Engineer, AI Networking Senior Software Engineer role focused on building and productizing ML tools for optimizing AI workloads (LLM training/inference) across GPU/CPU clusters, with a focus on networking and system resource utilization. Involves distributed deep learning, ML-based optimization techniques, and performance analysis. | Serve, Agent | 8 |
| Machine Learning Intern - 2026 NVIDIA is seeking a Machine Learning Intern to assist with AI technology development and demonstrations. The intern will work with NVIDIA SDKs, engage with the AI community, and contribute to machine learning projects and AI software development. | Serve | 8 |
| Deep Learning Architect, LLM Inference - New College Grad 2026 The role focuses on optimizing LLM inference server performance, workload characterization, and benchmarking for NVIDIA's GPUs. It involves collaborating with AI startups, developing performance tools, contributing to deep learning software projects, and guiding inference serving direction. | Serve | 8 |
| Senior Performance Engineer - LLM Inference Frameworks NVIDIA is seeking a Senior Performance Engineer to optimize LLM inference infrastructure on GPUs, focusing on throughput, memory efficiency, and scalability. The role involves designing and implementing high-performance pipelines, profiling, tuning model execution, and innovating techniques like Speculative Decoding and quantization. Experience with deep learning frameworks and performance debugging is required. | Serve | 8 |
| OEM Solutions Architect - AI Full Stack Public Sector NVIDIA is seeking a Solutions Architect to be the lead technical authority for Federal partnerships, focusing on deploying Generative AI at scale for U.S. Government agencies. The role involves architecting and optimizing the 'AI Factory,' leading POCs for NVIDIA's AI software stack, and navigating complex Federal security frameworks. The ideal candidate has extensive experience in full-stack data center architecture, the AI lifecycle (data curation, fine-tuning, inference orchestration), and strategic communication with both technical and leadership audiences within the public sector. | Serve, Post-train | 8 |
| AI Computing Development Engineer, TensorRT-LLM NVIDIA is seeking software engineers to develop and optimize inferencing software for AI models, specifically focusing on TensorRT-LLM. This role involves performance analysis, tuning, and collaboration across teams to advance machine learning inferencing capabilities. | Serve | 8 |
| Senior Software Engineer, JAX Senior Software Engineer to develop NVIDIA's AI platform, focusing on performance optimizations in deep learning frameworks using JAX. The role involves designing and implementing JAX core components, driving performance on NVIDIA products, and building tools to increase efficiency for AI-based systems. | Serve | 8 |
| Senior Architect - Server Performance NVIDIA is seeking architects to drive architectural performance for its next-generation AI server systems. This position demands a unique capability to bridge deep architectural knowledge, workload analysis, and hands-on silicon investigations. Candidates should be adept at working directly with silicon, high-level models, and simulators. Responsibilities include conducting performance investigations on both NVIDIA and competitive platforms, and developing targeted microbenchmarks to examine specific architectural aspects. The role does not heavily involve modeling tasks (functional or performance), though occasional focused assignments may arise. | Serve | 8 |
| Solutions Architect, Inference Deployments This role focuses on building and deploying AI inference solutions at scale using NVIDIA's GPU technology and Kubernetes. The Solutions Architect will collaborate with engineering, DevOps, and customers to optimize and serve generative AI models, ensuring low-latency inference in enterprise environments. | Serve | 8 |
| Senior Solutions Architect, Generative AI Senior Solutions Architect role focused on customer engagements, improving AI workload performance, and developing proof-of-concepts for Generative AI solutions (LLMs, recommenders) using NVIDIA software and technologies. Requires strong coding, GPU optimization, and communication skills. | Serve, Agent | 8 |
| Principal Deep Learning Communication Architect NVIDIA is seeking a Principal Deep Learning Communication Architect to lead the technical roadmap for communication libraries across next-generation platforms, ensuring seamless scaling of models to massive clusters. The role involves designing and optimizing communication primitives for heterogeneous interconnects, co-designing with application developers and silicon architects, and developing analytical models for system behavior. Expertise in parallel computing, HPC/distributed deep learning, inference engines, and GPU architecture is required. | Serve, Agent | 8 |
| Developer Technology Engineer - AI NVIDIA Developer Technology Engineer focused on optimizing AI workloads, particularly large language models (LLMs), on NVIDIA's GPU platform. The role involves deep dives into application performance, GPU kernel optimization, distributed training and inference, and collaboration with various internal teams and external developers. It requires strong software engineering skills, parallel programming expertise, and a focus on performance analysis and tuning. | Serve, Post-train | 8 |
| Senior AI Software Development Engineer, TensorRT-LLM NVIDIA is seeking a Senior AI Software Development Engineer for its TensorRT-LLM team. The role involves crafting and developing robust, scalable inference software for LLMs, focusing on performance analysis, optimization, and tuning. The engineer will write high-quality C++/Python code for the core backend software and collaborate with various teams to guide deep learning inference direction. A strong background in software development, LLM inference techniques, and deep learning frameworks is required. | Serve | 8 |
| Senior Product Manager, AI Inference - Dynamo Product Manager for NVIDIA Dynamo, a distributed inference framework for LLMs and Generative AI. Focuses on defining the roadmap for high-scale serving, optimizing hardware-software co-design, and developing agentic inference capabilities. Collaborates with engineering, open-source communities, and customers to integrate model evaluation into workflows. | Serve, Agent | 8 |
| AI and FSI Developer Technology Engineer - New College Grad 2026 NVIDIA is seeking an AI and FSI Developer Technology Engineer to optimize AI and HPC workloads on NVIDIA GPUs and CPUs, focusing on performance tuning and eliminating bottlenecks for financial markets. The role involves research, development, analysis, and collaboration with experts to improve performance across the stack, from algorithms to kernels. The engineer will also publish and present their work and influence future hardware/software designs. | Serve | 8 |
| Senior Solutions Architect - KV Cache and AI Storage Senior Solutions Architect focused on building LLM inference platforms using NVIDIA GPUs, KV cache, and tiered memory solutions. The role involves technical exploration with customers, performance analysis, and translating customer needs into product roadmaps. | Serve | 8 |
| Solutions Architect - Top AI Labs Solutions Architect role at NVIDIA focusing on optimizing LLM inference and training acceleration, contributing to open-source frameworks like SGLang and vLLM, and developing KV cache offloading. Requires strong programming, systems fundamentals, and experience in performance analysis. | Serve, Pretrain | 8 |
| Solutions Architect, Generative AI - CSP NVIDIA is seeking an AI-focused Solutions Architect with expertise in LLMs, generative AI, agentic AI, or recommender systems. The role involves providing technical expertise to customers, assisting with GPU infrastructure for AI, optimizing training and inference pipelines, and gathering customer feedback for product development. This position requires 3+ years of experience in AI for large models and proficiency with AI tools. | Serve, Post-train | 8 |
| Senior Deep Learning Solution Architect Senior Deep Learning Solution Architect at NVIDIA, focusing on LLM inference and training acceleration, performance optimization, and contributing to open-source frameworks like SGLang and vLLM. The role involves developing and optimizing inference frameworks, KV cache offloading, and exploring distributed training performance. | Serve, Post-train | 8 |
| Senior SOC Product Architect Physical AI Platforms This role focuses on architecting physical AI platforms for automotive and robotics, specifically defining the SoC architecture for embedded computer vision and AI systems. The individual will analyze use cases, map requirements to hardware/software features, define system requirements, and drive recommendations into product roadmaps. The role involves deep benchmarking, customer interaction, technical leadership, and mentorship, with a strong emphasis on functional safety (ISO 26262, SOTIF). | Serve | 8 |
| Deep Learning Algorithms Engineer - ACOT NVIDIA is looking for an AI Acceleration & Optimization Engineer to optimize the performance, scalability, and efficiency of AI models (LLMs, VLMs, diffusion, multimodal) on NVIDIA GPU platforms. The role involves profiling, identifying bottlenecks, and applying optimization techniques like quantization and kernel fusion, using tools such as CUDA, TensorRT, and Nsight. Collaboration with various teams (algorithms, systems, hardware, research, CUDA, compiler, frameworks) is key to bringing models from research to production. | Serve, Post-train | 8 |
| Senior Machine Learning Applications and Compiler Engineer, LPX NVIDIA is seeking a Senior Machine Learning Applications and Compiler Engineer to develop algorithms and optimizations for their LPX inference and compiler stack, working at the intersection of large-scale systems, compilers, and deep learning to map neural network workloads onto future NVIDIA platforms. | Serve | 8 |
| Senior Power Analysis and Optimization Engineer Senior Engineer to apply AI/ML and LLMs to power analysis and optimization for NVIDIA's GPUs and SoCs. Focus on developing and productionizing ML/RL models and custom LLMs to improve energy efficiency, interpret power data, and recommend optimizations. Involves RTL analysis, Verilog prototyping, and automation. | Serve, Data | 8 |
| Senior System Software Engineer – Embedded AI Inference Senior Software Engineer to develop production automotive software for AI inference and agent orchestration in C++ for embedded platforms. Focus on building next-generation automotive software applications, including in-car agentic AI and inference of LLM/VLM/VLA models on NVIDIA GPUs. | Serve, Agent | 8 |
| Senior Machine Learning Applications and Compiler Engineer, LPX Develops algorithms and optimizations for NVIDIA's LPX inference and compiler stack, focusing on mapping neural network workloads onto future NVIDIA platforms and optimizing end-to-end inference performance. Requires strong software engineering, compiler/runtime development, and deep learning framework experience. | Serve | 8 |
| DL System Software Engineer - AI Platform NVIDIA is seeking a DL System Software Engineer to develop an AI Platform for efficient inference and training of large-scale models on GPU clusters. The role involves designing and building solutions for scheduling workloads, resource management, and performance optimization, working with various NVIDIA AI technologies. | Serve, Post-train | 8 |
| Senior Software Engineer, TensorRT-LLM NVIDIA is seeking a Senior Software Engineer for its TensorRT-LLM team to develop and scale inferencing software for LLMs and Generative AI. The role involves crafting robust inferencing software, performing benchmarking and profiling for GPU applications, writing high-quality Python code for LLM inference, and improving the TensorRT-LLM library. Collaboration with software, research, and product teams is key. | Serve | 8 |
| Senior Software Engineer – TensorRT Edge-LLM Senior Software Engineer to develop and optimize a state-of-the-art inference framework for Large Language, Vision-Language, and Multimodal models on edge and embedded platforms, focusing on real-time performance and constrained environments. | Serve | 8 |
| Senior Performance Engineer - Deep Learning Senior Performance Engineer at NVIDIA focused on optimizing Deep Learning models and frameworks (PyTorch, JAX) for NVIDIA GPUs. The role involves building and supporting Transformer Engine, collaborating on systems research for performance improvements, implementing and benchmarking new DL models, contributing to MLPerf, and engaging with the open-source community and enterprise customers. It also involves influencing future hardware and software design. | Serve, Post-train | 8 |
| Senior Software Engineer, Quantized Inference Senior Software Engineer focused on optimizing quantized inference for LLMs by implementing recipes, developing kernels, and collaborating on inference engines like vLLM and TRT-LLM. The role involves model export pipelines, benchmarking, and data analysis tooling. | Serve | 8 |
| Senior Compiler Engineer, AI Inference Performance NVIDIA is seeking a Senior Compiler Engineer to optimize AI inference performance for their Deep Learning & AI Compiler (DLC) team. The role involves analyzing deep learning networks, developing compiler optimization algorithms, and collaborating with framework and architecture teams to accelerate next-generation deep learning software for various AI applications. | Serve | 8 |
| Senior Compiler Engineer, AI Inference Platforms NVIDIA is seeking a Senior Compiler Engineer to join its Deep Learning & AI Compiler (DLC) team. The role involves analyzing deep learning networks, developing compiler optimization algorithms, and collaborating with framework and architecture teams to accelerate AI inference performance on NVIDIA GPUs. The compiler is critical for data centers, personal devices, automotive, and robotics, focusing on inference performance, build time, memory footprints, and ease of use. | Serve | 8 |
| Solutions Architect - CPU and LPU NVIDIA Solutions Architect focused on optimizing AI inference workloads across CPU, GPU, and LPU platforms for customers. The role involves technical expertise, proof-of-concept development, and optimizing AI efficiency in heterogeneous environments. | Serve, Agent | 8 |
| Principal AI Developer Technology Engineer This role focuses on researching and developing techniques to accelerate AI workloads (deep learning, machine learning) on advanced computer architectures, specifically GPUs. The engineer will perform in-depth analysis and optimization of complex AI and HPC algorithms, publish findings, and influence future hardware/software design. Requires deep C/C++ programming, parallel programming (CUDA, etc.), low-level performance optimization, and CPU/GPU architecture expertise. | Serve | 8 |
| Principal AI Developer Technology Engineer Seeking a Principal Developer Technology Engineer to research and develop techniques for GPU acceleration of AI workloads, focusing on performance optimization of deep learning and HPC algorithms on modern CPU and GPU architectures. This role involves collaborating with internal teams and the developer community, influencing hardware/software design, and publishing findings. | Serve | 8 |
| Solution Architect, Energy NVIDIA is seeking a Solution Architect with deep expertise in AI solutions to drive the efficient use of compute platforms in the Energy Industry. The role involves being a trusted technical advisor to developers and customers, embedding NVIDIA software, improving application performance, and establishing the foundation for next-generation AI systems. Responsibilities include supporting business development, working directly with customers, assisting in the adoption of NVIDIA software, analyzing architectures for acceleration opportunities, providing feedback to engineering teams, and delivering trainings and demonstrations. Requires an MS/PhD in a technical field, 5+ years of experience in AI/ML/DL/NLP/Generative AI, and 5+ years of industrial experience in power grid software and advanced ML for grid operations. Familiarity with accelerated computing, GPU systems, Python/C/C++, major AI frameworks, containers, and version control is essential. Experience designing and building complex AI/ML solutions, and reasoning across various system components is also required. | Serve | 8 |
| Developer Relations Manager – AI Natives NVIDIA is seeking a Developer Relations Manager to engage with AI-native companies, helping them design, optimize, and scale their AI platforms on NVIDIA technologies. The role involves advising founders and engineering teams on building agentic systems, AI copilots, and multimodal applications, with a focus on accelerating training, optimizing inference, and delivering AI experiences. The ideal candidate has deep technical expertise in AI systems, developer platforms, and large-scale inference infrastructure. | Serve, Agent | 8 |
| Senior AI Performance and Efficiency Engineer Senior AI/ML Performance and Efficiency Engineer focused on optimizing GPU cluster performance for AI/ML researchers by addressing infrastructure and application bottlenecks. This role involves building tools, analyzing efficiency, and collaborating across teams to improve hardware, software, and infrastructure usage for various ML workloads like Robotics, Autonomous vehicles, LLMs, and Videos. | Serve | 8 |
| Senior AI Developer Technology Engineer Senior Developer Technology Engineer focused on researching and developing techniques to GPU-accelerate AI workloads, optimizing performance on modern CPU and GPU architectures, and collaborating with the developer community and internal teams to influence next-generation hardware and software design. | Serve | 8 |
| Engineering Manager, AI Developer Technology Engineering Manager for NVIDIA's AI Developer Technology team, focused on leading a team to optimize and develop algorithms for Deep Learning and Machine Learning applications, influencing next-generation hardware/software, and collaborating with customers and internal teams. The role involves optimizing training and inference performance on NVIDIA hardware. | Serve, Post-train | 8 |