Senior Manager, Engineering - Enterprise AI and Automation

NVIDIA · Semiconductors · Santa Clara, CA

Senior Engineering Manager to lead the strategy and execution for NVIDIA’s agentic developer platform, focusing on building, evaluating, and improving autonomous agents. The role involves identifying gaps, driving POCs, operationalizing approaches into reusable components, and establishing governance and safety mechanisms to scale autonomous systems within NVIDIA.

What you'd actually do

Track and deeply understand evolving agent development patterns across NVIDIA and the broader ecosystem
Identify gaps and friction in current agent architectures, and translate insights into a platform strategy that boosts developer velocity and agent quality—backed by evaluations, benchmarking, and feedback loops
Assess and integrate open source and third-party tools where they add leverage; drive clear build-vs-use decisions
Architect and integrate high-performance data pipelines, RAG systems, vector databases, and GPU-optimized training and inference workflows
Lead integration of the AI Data Platform into NVIDIA’s on-prem AI Factory, optimizing GPU-to-storage throughput, data locality, and distributed inference performance

Skills

Required

Bachelor’s degree in CS/Engineering or equivalent experience
10+ overall years in software engineering, including 4+ years managing high-performing teams
Strong hands-on experience with evolving agent architectures and open-source libraries; deep expertise in LLM/agent architectures—leading POCs and integrating them into real business use cases with measurable adoption/impact
Ability to turn fast-moving, ambiguous problem spaces into clear platform strategy, roadmap, and outcomes
Proven track record building multi-team developer platforms (APIs/SDKs, reusable components, reference implementations)
Experience building evaluation/benchmarking systems for agent workflows (metrics, regression, feedback loops)
Strong judgment integrating OSS/3P tools; clear build-vs-use decision-making and integration strategy
Product approach for safety and governance: controls, auditability, monitoring, and risk management
Strong leadership and executive communication (engineering, product, security, research)

Nice to have

Experience implementing enterprise-grade governance for agent systems (controls, auditability, monitoring, policy enforcement) in production autonomous workflows
Demonstrated wins taking new/open-source agent constructs from POC to production adoption, with clear business impact (cycle time, quality, cost, reliability)
Built and scaled an agent platform or agent developer experience used by multiple teams (SDKs, templates, reference apps, reusable building blocks)
Clear point of view and real examples on build-vs-use decisions—when to adopt OSS/3P vs build internal primitives—and how to operationalize the choice
Deep experience with agent evaluation at scale (long-horizon tasks, tool correctness, reliability testing, automated regressions, offline/online feedback loops)

What the JD emphasized

deeply understanding how teams across the company build, evaluate, and improve autonomous agents
turning those evolving patterns into scalable platform capabilities
drive rapid proof-of-concepts on emerging agent constructs and ecosystem tools
operationalize the best approaches into reusable building blocks, integrations, and governance mechanisms
build a platform
platform helps teams safely ship more autonomous systems at NVIDIA scale

Other signals

leading strategy and execution for NVIDIA’s agentic developer platform

Full job description

As a Senior Engineering Manager for Agentic Systems & Platform Architecture, you will lead the strategy and execution for NVIDIA’s agentic developer platform. That means deeply understanding how teams across the company build, evaluate, and improve autonomous agents, and turning those evolving patterns into scalable platform capabilities. You will identify gaps and friction, drive rapid proof-of-concepts on emerging agent constructs and ecosystem tools, and operationalize the best approaches into reusable building blocks, integrations, and governance mechanisms that accelerate developer productivity and agent quality. If you’re passionate about staying at the forefront of agent architectures and turning experimentation into real business impact, this role offers a chance to build a platform that helps teams safely ship more autonomous systems at NVIDIA scale.

What you will be doing:

Track and deeply understand evolving agent development patterns across NVIDIA and the broader ecosystem
Identify gaps and friction in current agent architectures, and translate insights into a platform strategy that boosts developer velocity and agent quality—backed by evaluations, benchmarking, and feedback loops
Assess and integrate open source and third-party tools where they add leverage; drive clear build-vs-use decisions
Architect and integrate high-performance data pipelines, RAG systems, vector databases, and GPU-optimized training and inference workflows
Lead integration of the AI Data Platform into NVIDIA’s on-prem AI Factory, optimizing GPU-to-storage throughput, data locality, and distributed inference performance
Establish and enforce robust Agent Governance policies across the platform that cover model/tool usage and data lineage and ensure adherence to compliance and Responsible AI frameworks
Design, implement, and maintain a centralized Agent Safety Toolkit, providing developers with pre-vetted components for input/output guardrails and prompt injection defenses
Lead and grow a high-performing team along with a multi-functional community to standardize procedures and scale adoption

What we need to see:

Bachelor’s degree in CS/Engineering or equivalent experience
10+ overall years in software engineering, including 4+ years managing high-performing teams
Strong hands-on experience with evolving agent architectures and open-source libraries; deep expertise in LLM/agent architectures—leading POCs and integrating them into real business use cases with measurable adoption/impact
Ability to turn fast-moving, ambiguous problem spaces into clear platform strategy, roadmap, and outcomes
Proven track record building multi-team developer platforms (APIs/SDKs, reusable components, reference implementations)
Experience building evaluation/benchmarking systems for agent workflows (metrics, regression, feedback loops)
Strong judgment integrating OSS/3P tools; clear build-vs-use decision-making and integration strategy
Product approach for safety and governance: controls, auditability, monitoring, and risk management
Strong leadership and executive communication (engineering, product, security, research)

Ways to stand out from the crowd:

Experience implementing enterprise-grade governance for agent systems (controls, auditability, monitoring, policy enforcement) in production autonomous workflows
Demonstrated wins taking new/open-source agent constructs from POC to production adoption, with clear business impact (cycle time, quality, cost, reliability)
Built and scaled an agent platform or agent developer experience used by multiple teams (SDKs, templates, reference apps, reusable building blocks)
Clear point of view and real examples on build-vs-use decisions—when to adopt OSS/3P vs build internal primitives—and how to operationalize the choice
Deep experience with agent evaluation at scale (long-horizon tasks, tool correctness, reliability testing, automated regressions, offline/online feedback loops)

We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you! Ready to take your career to the next level? Apply now and join the innovation!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 431,250 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until February 27, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.