Senior AI-Native Systems Software Engineer, TensorRT

NVIDIA · Semiconductors · Santa Clara, CA

Senior engineer to architect and build an AI-native framework that uses AI agents for software development, with a focus on scaling, performance optimization, and integration of SOTA models for inference.

What you'd actually do

  1. Architecting an AI-native framework: Help design and build a codebase and architecture that scales beyond human capacity, supporting large numbers of AI agents working in parallel to generate, test, and validate production-grade software.
  2. Scaling through agentic workflows: Improve the ratio of compute-to-software output by adopting and building AI-native tools, multi-agent orchestrators, and codebase harnesses that keep humans focused on the highest-value work.
  3. Rapid prototyping with SOTA models: Act as a technical scout, identifying industry and academic breakthroughs (e.g., new attention mechanisms, KV cache strategies) and dispatching AI agent swarms to prototype and integrate these capabilities into our framework.
  4. Delivering a great user experience: Ensure a seamless, high-performance path to production for the latest model families (LLMs, Diffusion, Audio, Vision and multi-modal models).
  5. Extreme performance optimization: Work at the intersection of Python orchestration and C++ engine-level optimizations to achieve major latency and throughput gains for critical customer use cases.

Skills

Required

  • BS, MS, or PhD in Computer Science, Computer Engineering, AI, or equivalent experience.
  • 4+ years of relevant software development experience.
  • Strong modern C++ skills: Proficiency with C++11/14/17 (or newer) and the STL, with an emphasis on clean, maintainable, performant code.
  • Deep learning familiarity: Experience with modern inference frameworks and an understanding of the architectural nuances of LLMs, Diffusion, and multi-modal models.
  • Systems thinking: Interest in how software architecture must evolve to support automated, agent-driven development and indefinitely scaling codebases.
  • End-to-end product sense: Ability to translate high-level customer needs into concrete technical requirements and user-centric solutions.
  • Pragmatic execution: Demonstrated ability to go from customer requests to production-quality software on tight timelines.
  • Collaborative mindset: Excellent communication skills and comfort working across internal organizations and with customers.

Nice to have

  • Agentic framework experience: Hands-on work with AI agent orchestrators or multi-agent coding frameworks, or experience building custom agentic coding harnesses for production software.
  • CUDA & kernel expertise: Experience with CUDA programming or exposure to kernel generation / autotuning efforts.
  • High-velocity prototyping: A track record of rapidly turning state-of-the-art papers into working prototypes in days, not weeks.
  • Performance profiling skills: Expertise in software performance analysis, profiling, and optimization (CPU and/or GPU), including using tooling to drive measurable wins.

What the JD emphasized

  • AI-native initiative
  • AI agents
  • agentic development framework
  • large numbers of AI agents working in parallel
  • multi-agent orchestrators
  • AI agent swarms
  • LLMs, Diffusion, Audio, Vision and multi-modal models
  • Extreme performance optimization
  • modern inference frameworks
  • architectural nuances of LLMs, Diffusion, and multi-modal models
  • automated, agent-driven development
  • indefinitely scaling codebases
  • Agentic framework experience
  • AI agent orchestrators or multi-agent coding frameworks
  • custom agentic coding harnesses
  • High-velocity prototyping
  • rapidly turning state-of-the-art papers into working prototypes
