AI Hire Signal

Tracking AI hiring across 200+ US tech companies. Stage, salary, and stack signals on every role — refreshed weekly.

© 2026 AI Hire Signal · Not affiliated with companies shown

Currently tracking 440 active AI roles, down 53% versus the prior 4 weeks. Primary focus: Serve · Engineering. Salary range $100k–$575k (avg $262k).

Hiring: 440 / 623
Momentum (4w): -386 (-53%) · 340 opens last 4w · 726 prior 4w
Salary range: $100k–$575k · avg $262k · USD · disclosed roles only
Tracked since: May '25 · last role 4w ago
Hiring velocity (new roles per week)

Dec 30: 1 new role
Mar 10: 1 new role
Mar 24: 1 new role
Apr 28: 1 new role
May 12: 4 new roles
May 19: 5 new roles
May 26: 3 new roles
Jun 2: 3 new roles
Jun 9: 2 new roles
Jun 16: 1 new role
Jun 23: 2 new roles
Jun 30: 3 new roles
Jul 7: 4 new roles
Jul 14: 1 new role
Jul 28: 2 new roles
Aug 11: 4 new roles
Aug 18: 6 new roles
Aug 25: 2 new roles
Sep 1: 3 new roles
Sep 15: 8 new roles
Sep 22: 3 new roles
Sep 29: 6 new roles
Oct 6: 2 new roles
Oct 13: 2 new roles
Oct 20: 3 new roles
Oct 27: 6 new roles
Nov 3: 9 new roles
Nov 10: 8 new roles
Nov 17: 8 new roles
Nov 24: 4 new roles
Dec 1: 11 new roles
Dec 8: 9 new roles
Dec 15: 14 new roles
Dec 22: 10 new roles
Dec 29: 8 new roles
Jan 5: 107 new roles
Jan 12: 22 new roles
Jan 19: 45 new roles
Jan 26: 32 new roles
Feb 2: 59 new roles
Feb 9: 64 new roles
Feb 16: 63 new roles
Feb 23: 83 new roles
Mar 2: 83 new roles
Mar 9: 88 new roles
Mar 16: 97 new roles
Mar 23: 72 new roles
Mar 30: 215 new roles
Apr 6: 158 new roles
Apr 13: 250 new roles
Apr 20: 199 new roles
Apr 27: 332 new roles
May 4: 304 new roles
May 11: 189 new roles
May 18: 131 new roles
May 25: 102 new roles
Jun 1: 129 new roles
Jun 8: 122 new roles
Jun 15: 49 new roles
Jun 22: 40 new roles
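
The 4-week momentum figures reported above (340 opens in the last 4 weeks, 726 in the prior 4, a change of -386 or -53%) are consistent with summing the eight most recent weekly counts in the hiring velocity data. The sketch below reproduces that arithmetic. The function name `momentum`, the `window` parameter, and the rounding to a whole percent are assumptions made for illustration; the site does not document its exact formula.

```python
def momentum(weekly_counts, window=4):
    """Compare the latest `window` weeks against the `window` weeks before them.

    Hypothetical helper: returns (recent, prior, delta, pct), where pct is the
    percentage change rounded to a whole number, or None if prior is zero.
    """
    recent = sum(weekly_counts[-window:])
    prior = sum(weekly_counts[-2 * window:-window])
    delta = recent - prior
    pct = round(100 * delta / prior) if prior else None
    return recent, prior, delta, pct


# The eight most recent weekly counts from the velocity data, oldest first.
last_eight_weeks = [304, 189, 131, 102, 129, 122, 49, 40]

print(momentum(last_eight_weeks))  # (340, 726, -386, -53)
```

The same percentage-change calculation would apply to any pair of adjacent periods of equal length, such as the 30-day windows quoted elsewhere on the page.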

NVIDIA currently has 496 active AI-related job listings. The majority of these roles, 52%, are focused on serving infrastructure, with agents representing another significant segment at 23%. Engineering is the dominant function, with 441 positions. The United States leads hiring geographies with 287 roles, followed by China with 64. Frequent tech tags include model_serving, inference_infra, and agent_orchestration, suggesting a focus on deployment and management of AI models. Over the last 30 days, NVIDIA posted 214 new AI roles, a 27% decrease compared to the previous 30-day period.

Auto-generated from active job postings · last refreshed 2026-05-24

Frequently asked questions

  • What AI roles is NVIDIA hiring for?

    NVIDIA currently has 487 active AI-related roles in our index. The most common open titles are: Deep Learning Performance Architect (4), Senior Deep Learning Performance Architect (4), AI Research Scientist (3), Developer Technology Engineer - AI (3), Manager, Deep Learning Algorithms (3). Most positions are in Engineering and Research.

  • What stage of AI development does NVIDIA focus on?

    NVIDIA's active AI hiring is concentrated in: serving infrastructure (54%), agents (21%), application (8%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.

  • Where is NVIDIA hiring AI talent?

    NVIDIA is hiring AI talent in: United States (286 roles), China (59 roles), Israel (50 roles), Germany (21 roles).

  • What technologies does NVIDIA's AI team work with?

    Job postings at NVIDIA most frequently reference: model serving, inference infra, agent orchestration, llm observability, multimodal.

  • How many AI roles has NVIDIA posted recently?

    In the past 30 days, NVIDIA has posted 110 new AI-related roles. That is a 50% decrease versus the prior 30 days (218 → 110).

Jobs (439)

434 AI · 1824 total active
Filters: Country: United States · Active only · AI only (≥ 7)

Stage: Data (28), Pretrain (30), Post-train (51), Serve (356), Agent (192), Eval Gate (11), Ship (55)
Function: Engineering (627), Research (82), Product (14)
Country: United States (439), China (93), Israel (54), Germany (36), Switzerland (31), India (26), United Kingdom (24), Poland (17), Vietnam (13), Canada (12), Singapore (11), France (10), Netherlands (9), Italy (8), Taiwan (6), Hong Kong (4), Japan (4), Spain (3), Australia (2), Czech Republic (2), Finland (2), Hungary (2), South Korea (2), Armenia (1), Brazil (1), Mexico (1), Romania (1), Saudi Arabia (1), Sweden (1), United Arab Emirates (1)

Columns: Title, Stage, Function, Location, First seen, AI score
Senior Deep Learning Software Engineer
Senior Deep Learning Software Engineer to design and build an automated inference and deployment solution with a scalable architecture focusing on ease-of-use and compute efficiency. The role involves developing features in high-level frameworks, implementing a high-performance execution environment, and low-level GPU optimizations.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA +1 | First seen: Apr 24 | AI score: 9
Principal Architect, AI Networking
This role leads the research agenda and architectural direction for NVIDIA's AI networking systems, focusing on high-performance communication at scale. It involves original research, hardware-software co-optimization, and integrating networking into AI serving stacks, with a requirement to publish findings and ship production-grade software.
Stage: Serve, Pretrain | Function: Research | Location: Santa Clara, CA +4 · Remote | First seen: Apr 23 | AI score: 9
Senior Software Engineer, RL Post-Training Frameworks
NVIDIA is seeking a Senior Software Engineer to build and scale RL post-training infrastructure, focusing on distributed systems, high-performance computing, and deep learning infrastructure. The role involves architecting and optimizing RL training-inference-rollout loops, ensuring fault tolerance and elastic scaling, and collaborating with researchers and hardware teams.
Stage: Post-train, Serve | Function: Engineering | Location: Santa Clara, CA +1 · Remote | First seen: Apr 23 | AI score: 9
Manager, Deep Learning – Autonomous Vehicles and Robotics
Manager for a Deep Learning Engineering team focused on delivering production-quality deep learning solutions for autonomous vehicles and robotics on edge hardware. The role involves leading a team, defining technical initiatives, and collaborating with automotive OEMs and robotics partners to optimize solutions on NVIDIA platforms, working at the intersection of model architectures, compiler technology, and embedded deployment.
Stage: Serve, Post-train | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 22 | AI score: 9
Senior AI Software Engineer, Kernel Libraries
Senior AI Software Engineer focused on developing kernel libraries and inference systems software to accelerate AI workloads, including LLMs and agents, on NVIDIA's hardware. Responsibilities include innovating and optimizing kernels, designing abstractions for serving engines, and building compilers/runtimes.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA +1 · Remote | First seen: Apr 22 | AI score: 9
Senior Software Engineer, AI and DL Kernel Libraries
Develops libraries, code generators, and GPU kernel technologies for NVIDIA's AI inference systems software stack, focusing on accelerating AI inference through efficient kernels, abstractions, and runtimes for LLMs and agents.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA +7 · Remote | First seen: Apr 22 | AI score: 9
Senior AI Compiler Engineer, MLIR
NVIDIA is hiring a Senior AI Compiler Engineer to build an MLIR-based AI compiler for their inference engine, focusing on performance, low memory usage, and usability across data center and edge. The role involves developing graph representations, optimizations, defining APIs, and implementing compiler optimizations and kernel generation for neural networks.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA +5 · Remote | First seen: Apr 22 | AI score: 9
Senior DL Software Engineer, Model Optimization and Edge Deployment - Autonomous Vehicles
Senior DL Software Engineer focused on optimizing and deploying large multimodal models (LLMs/VLMs) for real-time robotic execution in autonomous vehicles. The role involves advanced model compression, quantization, pruning, distillation, and inference optimization techniques for edge deployment on NVIDIA hardware, integrating with C++ production environments.
Stage: Serve, Agent | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 21 | AI score: 9
Solutions Architect, AI Models
NVIDIA is seeking a Solutions Architect to help enterprise customers adopt NVIDIA AI software and models. This role involves developing end-to-end AI solutions, tackling complex challenges across the AI model lifecycle (data processing, orchestration, training, post-training, RL, evaluation, optimization), and supporting a broad model portfolio. The architect will partner with customers to understand their needs and deliver customized AI solutions, contributing to product improvement and sharing knowledge through open-source projects, product engineering, or training.
Stage: Ship, Post-train | Function: Engineering | Location: Santa Clara, CA +1 · Remote | First seen: Apr 21 | AI score: 9
Senior Solutions Architect, Retail
Senior Solutions Architect for Retail at NVIDIA, focusing on developing and deploying Agentic AI solutions for enterprise clients. The role involves building complex agentic systems, RAG pipelines, and optimizing inference performance using NVIDIA's AI infrastructure. Requires strong programming skills, experience with LLM applications, and agentic frameworks.
Stage: Agent | Function: Engineering | Location: CA · Remote | First seen: Apr 21 | AI score: 9
Senior Research Engineer - Video Search
Senior Research Engineer at NVIDIA to design and build video search technologies for Autonomous Vehicles, Robotics, and Medical applications, focusing on exabyte scale and agentic search. The role involves developing and integrating innovative video search approaches, benchmarking retrieval methods, and collaborating with researchers and product teams to build robust physical AI dataset search workflows.
Stage: Agent, Data | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 19 | AI score: 9
Senior Deep Learning Software Engineer, LLM Performance
Senior Deep Learning Software Engineer focused on optimizing LLM inference performance on NVIDIA accelerators using frameworks like TensorRT-LLM, vLLM, and Triton. The role involves implementing and scaling inference, serving, and deployment algorithms, collaborating with various teams, and contributing to NVIDIA/OSS LLM frameworks.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 16 | AI score: 9
Senior Solutions Architect, Generative AI Specialist
Senior Solutions Architect specializing in Generative AI, focusing on developing end-to-end AI solutions, reference architectures, and proof-of-concept engagements for agentic AI systems and LLM-powered workflows. The role involves designing multi-cloud strategies, leading workshops, and advising on MLOps principles and emerging standards for agentic AI.
Stage: Agent | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 15 | AI score: 9
Senior Machine Learning and Simulation Engineer - Autonomous Vehicles
Senior ML Engineer focused on building and optimizing large-scale Reinforcement Learning (RL) training frameworks for multi-modal Autonomous Vehicle (AV) foundation models. This role involves designing simulation and data processing pipelines, refining reward functions, and ensuring the reliability of training workflows on GPU clusters, with a focus on closed-loop simulation for training end-to-end AV models.
Stage: Post-train, Agent | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 15 | AI score: 9
Solutions Architect, Generative AI
NVIDIA is seeking an AI Engineer or Solutions Architect to enable ecosystem partners for Generative AI. The role involves building innovative proof-of-concept solutions and reference architectures for AI agents, demonstrating NVIDIA's full-stack accelerated Generative AI platforms. Responsibilities include acting as a technical expert, developing foundational solutions, providing technical blueprints, advising on deployment, and enabling partners to build their own services and products. The role requires experience in deploying AI models at scale, building enterprise-grade agentic AI systems, and proficiency in LLM/VLM frameworks and Python/C++.
Stage: Agent, Serve | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 15 | AI score: 9
Senior ML Evaluation Engineer - Autonomous Vehicles
NVIDIA is seeking a Senior ML Evaluation Engineer for their Autonomous Vehicles team. The role involves designing and building learned evaluation pipelines using LLMs, VLMs, and agentic workflows to assess driving behavior. The engineer will define evaluation methodologies, build golden-set frameworks, and contribute to the transition from rule-based to learned evaluation systems. This position requires a strong background in ML system development, software engineering, and experience with large-scale data processing, with a focus on shipping production ML systems.
Stage: Eval Gate, Agent | Function: Engineering | Location: Santa Clara, CA +4 · Remote | First seen: Apr 15 | AI score: 9
Senior Software Engineer - AI Inference
Senior Software Engineer focused on optimizing and contributing to open-source LLM inference serving engines like vLLM and SGLang to run efficiently on NVIDIA GPUs, focusing on high-throughput, low-latency inference at scale.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA +3 · Remote | First seen: Apr 14 | AI score: 9
Senior Solutions Architect, Autonomous Vehicles - Data Center
NVIDIA is seeking a Senior Solutions Architect for Autonomous Vehicles and Robotics to help customers accelerate Physical AI workloads using NVIDIA's full-stack technologies. The role involves engaging with customers to optimize training, simulations, and synthetic data generation for AV perception and planning models, providing technical expertise, and driving full-stack adoption. The candidate will analyze and optimize AI models for GPU performance, build collateral for various AI workflows, and provide technical leadership. Requires 8+ years of ML/DL Infra experience in AVs, proficiency in Python, CUDA/C++, Linux, DevOps tools, and a strong understanding of AV models and simulations. Experience with model deployment at scale and robotics model development is a plus. The role focuses on the data and infrastructure aspects of AI model development and deployment in the AV domain.
Stage: Data, Serve | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 14 | AI score: 9
Principal Engineer - AI Agents and Systems
Principal Engineer to lead the deployment of advanced AI agent frameworks and local runtimes on Windows and NVIDIA GPUs, focusing on open-source agents, local inference, privacy, and security for consumer PCs.
Stage: Agent, Serve | Function: Engineering | Location: Santa Clara, CA +1 | First seen: Apr 13 | AI score: 9
Research Scientist, Generative AI for Physical AI - PhD New College Grad 2026
Research Scientist role focused on Generative AI for Physical AI, developing advanced video generative and video-language models, and scaling large-scale training systems for foundation models. Requires a PhD and expertise in PyTorch, diffusion, vision-language, reasoning models, RL, and physics simulation.
Stage: Pretrain, Post-train | Function: Research | Location: Santa Clara, CA | First seen: Apr 10 | AI score: 9
Senior Director - GenAI Data Strategy
Senior Director role focused on defining and executing a comprehensive data strategy for foundation models, encompassing multi-modal data acquisition, curation, synthetic generation, and alignment techniques like RLHF. This role bridges research insights with data collection to improve model performance and safety, and engages with customers to translate deployment gaps into data priorities.
Stage: Data, Post-train | Function: Product | Location: Santa Clara, CA | First seen: Apr 9 | AI score: 9
Senior Software Engineer - Agentic Memory
Senior Software Engineer role focused on developing and researching agentic memory systems, including designing benchmarks, generating synthetic data, running experiments, and contributing to open-source evaluation tools. The role involves partnering with other NVIDIA teams deploying agents and advancing the state of the art in agentic memory evaluation.
Stage: Agent, Eval Gate | Function: Engineering | Location: CA +4 · Remote | First seen: Apr 8 | AI score: 9
Senior Machine Learning Engineer, Perception - Autonomous Driving
NVIDIA is seeking a Senior Machine Learning Engineer for their autonomous driving perception team. The role involves designing and developing end-to-end deep learning solutions for perception modules, focusing on road layout detection and other critical driving components. Responsibilities include applied research, data-driven development, and productizing solutions with a focus on safety, latency, and robustness. Experience with deep learning frameworks, Python/C++, and perception for autonomous driving or robotics is required.
Stage: Ship, Data | Function: Engineering | Location: Santa Clara, CA +2 · Remote | First seen: Apr 8 | AI score: 9
Senior High-Performance LLM Training Engineer
NVIDIA is seeking an experienced Senior High-Performance LLM Training Engineer to optimize LLM training workloads on advanced computing systems. The role focuses on improving the efficiency of NVIDIA's high-performance LLM software stack using frameworks like PyTorch and JAX for training on thousands of GPUs, and influencing future hardware roadmaps.
Stage: Data | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 8 | AI score: 9
Senior Robotics Research Engineer, Robotics and AI for Drug Discovery
Senior Robotics Research Engineer focused on building physical AI for drug discovery labs, involving robotics simulation, perception, task and motion planning, and training robots for manipulation tasks using imitation and reinforcement learning.
Stage: Agent, Data | Function: Research | Location: Santa Clara, CA | First seen: Apr 8 | AI score: 9
Senior Solutions Architect, Generative AI Specialist
Senior Solutions Architect specializing in Generative AI, focusing on building and architecting enterprise-grade agentic AI systems, RAG pipelines, and multi-modal workflows. The role involves leading prototyping, proof-of-concept collaborations, and providing technical advisory to sophisticated AI partners, with a strong emphasis on GPU-accelerated inference at scale, production optimization, and creating reusable technical assets. Responsibilities include problem-solving across the AI stack, collaborating with internal teams, and contributing to team growth and practice building.
Stage: Agent, Serve | Function: Engineering | Location: Santa Clara, CA +5 | First seen: Apr 7 | AI score: 9
Solutions Architect, LLM Model Builder
Solutions Architect focused on enabling partners to build, benchmark, fine-tune, optimize, and deploy foundation model solutions for customer workloads, with an emphasis on reasoning, multimodal, and production inference.
Stage: Serve, Post-train | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 7 | AI score: 9
Solutions Architect, LLM Model Builder
Solutions Architect focused on enabling partners to build, benchmark, fine-tune, optimize, and deploy foundation model solutions for customer workloads, with a strong emphasis on production inference and reasoning/multimodal models.
Stage: Serve, Post-train | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 7 | AI score: 9
Senior Solutions Architect, Generative AI Specialist
Senior Solutions Architect specializing in Generative AI, focusing on building and deploying enterprise-grade agentic AI systems, RAG pipelines, and multi-modal workflows with GPU-accelerated inference at scale. The role involves acting as a technical advisor, leading prototyping, architecting solutions, and resolving complex system issues for NVIDIA's advanced AI partners.
Stage: Agent, Serve | Function: Engineering | Location: Santa Clara, CA +5 · Remote | First seen: Apr 7 | AI score: 9
Solutions Architect, Applied AI Builder
This role focuses on building production-grade AI applications and agent systems for enterprises, involving design, orchestration, integration, observability, and deployment on NVIDIA's platforms. The candidate will lead by example as a hands-on developer, creating proof-of-concept solutions and deployable single-agent and multi-agent systems to solve real business problems.
Stage: Agent | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 7 | AI score: 9
Senior HPC and AI Networking Performance Research and Analysis Engineer
Research and analysis engineer focused on optimizing AI networking performance for large-scale LLM training on distributed GPU clusters, involving profiling, analysis, tool development, and collaboration across hardware and software teams.
Stage: Pretrain, Serve | Function: Research | Location: Santa Clara, CA | First seen: Apr 6 | AI score: 9
Senior Manager, Software Engineering - JAX
Senior Engineering Manager to define and drive NVIDIA's JAX strategy, coordinating multiple teams to ensure JAX delivers peak performance across heterogeneous hardware (GPUs, CPUs, LPUs). The role involves supporting emerging needs across training, post-training, inference, and robotics, bridging new hardware capabilities with AI trends. Key responsibilities include driving engineering contribution strategy, promoting teamwork, building partnerships with open-source projects, designing processes, and leading a high-performing engineering organization.
Stage: Serve, Post-train | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 6 | AI score: 9
Deep Learning Software Engineer, TensorRT Performance - New College Grad 2026
NVIDIA is seeking a Deep Learning Software Engineer to analyze and improve the performance of their inference ecosystem, focusing on TensorRT and related frameworks. The role involves optimizing inference solutions for various NVIDIA accelerators, developing new model pipelines, and collaborating with cross-functional teams on generative AI, robotics, and vision/speech understanding applications.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA +1 · Remote | First seen: Apr 4 | AI score: 9
Deep Learning Senior Engineer, End-To-End Autonomous Driving
NVIDIA is looking for a Deep Learning Senior Engineer to design, implement, and deploy end-to-end autonomous driving systems. The role focuses on leveraging LLMs, VLMs, and VLAs for reasoning and planning, involving model training, pre-training, fine-tuning, and integration into safety-critical vehicle firmware. Experience with production-grade ML models and C++ for deployment is required.
Stage: Post-train, Agent | Function: Engineering | Location: Santa Clara, CA +1 · Remote | First seen: Apr 4 | AI score: 9
Manager, Large Language Model Inference
Manager for Large Language Model Inference at NVIDIA, focusing on developing and optimizing LLM/VLM/VLA inference software for NVIDIA GPUs and hardware platforms. The role involves leading a team in specialized kernel development, runtime optimizations, and frameworks for LLM inference, with a strong emphasis on delivering production-grade, high-performance software.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA +1 · Remote | First seen: Apr 4 | AI score: 9
Senior Deep Learning Software Engineer, TensorRT Performance
NVIDIA is seeking a Senior Deep Learning Software Engineer to analyze and improve the performance of their deep learning inference ecosystem, specifically focusing on TensorRT. The role involves optimizing inference solutions for various NVIDIA accelerators, contributing to inference frameworks, and developing new model pipelines for generative AI and other applications.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA +1 · Remote | First seen: Apr 4 | AI score: 9
Senior Research Scientist
NVIDIA is seeking a Senior Research Scientist to join their applied research team focused on building next-generation Conversational AI systems. The role involves developing new Deep Learning models for ASR, speech synthesis, NMT, and NLP, designing large-scale training algorithms, and open-sourcing models using NeMo. The position requires a PhD, significant research experience in speech recognition or NLP, strong Python and PyTorch skills, and a proven publication record.
Stage: Post-train, Pretrain | Function: Research | Location: Santa Clara, CA +2 · Remote | First seen: Apr 4 | AI score: 9
Senior Perception Engineer, Obstacle Foundation Models - Autonomous Vehicles
NVIDIA is seeking a Senior Perception Engineer to design and productize its next-generation autonomous driving perception stack. The role focuses on the core 3D obstacle perception pipeline, involving architecture and algorithm design, and hands-on implementation using transformer-based, multi-modal, and vision-language techniques. Responsibilities include developing perception models, building production-grade deep learning models with pretraining and fine-tuning, defining KPI frameworks, contributing to data strategy, and collaborating with safety and systems teams. Requires a PhD/MS/BS with significant relevant experience, proficiency in PyTorch, Python/C++, and experience in data-driven development. Experience with autonomous driving/robotics perception, embedded platforms, optimization, and publications in leading conferences are desirable.
Stage: Ship, Post-train | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 4 | AI score: 9
Principal Deep Learning Senior Engineer, End-To-End Autonomous Driving
NVIDIA is seeking a Principal Deep Learning Senior Engineer to design, implement, and deploy end-to-end autonomous driving systems. The role focuses on leveraging LLMs, VLMs, and VLAs for advanced reasoning and planning in vehicles and robotics, involving model training, pre-training, fine-tuning, and integration into safety-critical systems.
Stage: Post-train, Agent | Function: Engineering | Location: Santa Clara, CA +1 · Remote | First seen: Apr 4 | AI score: 9
Principal Deep Learning Engineer – Perception, Autonomous Driving
Principal Deep Learning Engineer for NVIDIA's Autonomous Driving Perception team, focusing on developing, training, and deploying state-of-the-art perception systems (detection, segmentation, tracking) for vehicles. The role involves leading the end-to-end productization of these models, ensuring high quality and safety, defining data strategy, and providing technical leadership. Requires extensive experience in deep learning for computer vision and shipping commercial DL products.
Stage: Ship, Serve | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 4 | AI score: 9
Senior Deep Learning and Computer Vision Engineer - Autonomous Vehicles
Senior Deep Learning and Computer Vision Engineer for Autonomous Vehicles team, focusing on applying state-of-the-art techniques to build ground truth, train deep neural networks, and develop training pipelines and real-time inference run-times for self-driving cars.
Stage: Data, Serve | Function: Engineering | Location: Santa Clara, CA +2 | First seen: Apr 4 | AI score: 9
Machine Learning Engineer, GeForce G-Assist
Machine Learning Engineer at NVIDIA focused on building GeForce G-Assist, an on-device AI assistant. The role involves evaluating and improving SLMs and VLMs, optimizing local inference (e.g., llama.cpp), designing RAG systems, and supporting agentic AI workflows. Requires strong C/C++ and Python skills, experience with local inference frameworks, and knowledge of SLM/VLM architectures and agentic AI patterns.
Stage: Agent, Serve | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 4 | AI score: 9
Principal Perception Engineer, Obstacle Foundation Models - Autonomous Vehicles
Principal Perception Engineer at NVIDIA for Autonomous Vehicles, focusing on designing and productizing next-generation 3D obstacle perception stacks using deep learning, transformers, and multi-modal techniques. The role involves technical leadership, hands-on algorithm development, production-grade model development, data strategy, and collaboration with safety and systems teams for large-scale deployment.
Stage: Agent, Data | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 4 | AI score: 9
Senior Deep Learning Communication Architect
Senior Deep Learning Communication Architect role focused on optimizing communication performance for large-scale distributed deep learning training and inference. This involves identifying bottlenecks, designing efficient protocols, collaborating on hardware/software co-design, and exploring new communication technologies. The role requires deep understanding of parallelism techniques and experience with DNN frameworks and GPU computing.
Stage: Serve, Post-train | Function: Engineering | Location: Santa Clara, CA +1 | First seen: Apr 4 | AI score: 9
Senior Deep Learning Performance Architect - LPU
NVIDIA is seeking a Senior Deep Learning Performance Architect to focus on hardware-software co-design for AI Inference performance. The role involves designing GPU and system architectures, analyzing deep learning algorithms, building performance models, and collaborating with various teams to guide AI direction.
Stage: Serve | Function: Engineering | Location: CA +1 · Remote | First seen: Apr 4 | AI score: 9
Senior Systems Software Engineer - Deep Learning Solutions
Senior Systems Software Engineer focused on optimizing deep learning inference for autonomous vehicles and robotics on edge devices. Requires deep understanding of model architectures, kernel trace analysis, and evaluation of modern architectures on GPUs/SoCs, with a focus on TensorRT and compiler technology for embedded hardware.
Stage: Serve, Post-train | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 4 | AI score: 9
AI Inference Performance Engineer
This role focuses on optimizing and benchmarking Generative AI inference performance on NVIDIA's hardware accelerators, specifically working with frameworks like TensorRT-LLM, SGLang, and vLLM. The engineer will drive industry benchmark results by implementing optimizations in quantization, scheduling, memory management, and distributed inference. They will also define and optimize cutting-edge workloads, architect distributed inference systems from single-GPU to rack-scale, establish performance methodology using profiling, and contribute to open-source projects. The role requires strong programming skills (Python/C++), expertise in DL frameworks, and a deep understanding of LLM/VLM architectures and inference mechanics.
Stage: Serve | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 4 | AI score: 9
Senior Deep Learning Scientist, Multimodal Conversational AI
Senior Deep Learning Scientist role focused on developing, training, fine-tuning, and deploying streaming multimodal conversational AI systems. This includes speech, audio, vision, voice chat, and action, as well as human-AI interaction. The role involves applying research to define algorithmic improvements and scale them through the Nemotron platform, working on high-impact LLM products.
Stage: Post-train, Agent | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 4 | AI score: 9
Senior Deep Learning Engineer - Model Evaluation & AI Systems
Senior/Principal Deep Learning Engineer focused on building evaluation methodologies and infrastructure for AI models (LLMs, RAG, agents, vision/multimodal), including contributing to an open-source platform and collaborating with the community. The role involves working with model training, inference, and product teams to provide evaluation signals for release and optimization decisions.
Stage: Eval Gate, Agent | Function: Engineering | Location: Santa Clara, CA | First seen: Apr 4 | AI score: 9
Senior Deep Learning Engineer
Senior Deep Learning Engineer at NVIDIA focused on optimizing inference for next-generation AI workloads including multi-agent systems and generative multimodal models. The role involves characterizing emerging workloads and developing novel optimization methods across the inference stack, from algorithmic to system level, on NVIDIA hardware. Collaboration with research, framework development, and silicon architecture teams is key.
Stage: Serve, Agent | Function: Engineering | Location: Redmond, WA +1 | First seen: Apr 4 | AI score: 9