Distinguished Engineer – High Performance AI

NVIDIA · Semiconductors · Santa Clara, CA +5 · Remote

Distinguished Engineer role focused on building groundbreaking agentic AI systems for the CUDA ecosystem, encompassing multi-agent runtimes, orchestration, data/evaluation pipelines, training/inference stacks, and GPU-accelerated execution. The role involves defining technical strategy, co-designing solutions with hardware/software teams, developing evaluation frameworks, and driving architecture across the AI stack.

What you'd actually do

  1. Set strategy and lead execution for agentic AI systems for the CUDA ecosystem, defining roadmaps and measurable success metrics (performance, quality, reliability, developer productivity).
  2. Co-design agentic system solutions with software, hardware and algorithm teams; influence and adopt new capabilities as they become available.
  3. Develop reproducible, high-fidelity evaluation frameworks covering performance, quality and developer productivity.
  4. Collaborate across the AI stack, from hardware through compilers/toolchains, kernels/libraries, frameworks, distributed training, and inference/serving, and with model and research/engineering teams, to help drive architecture and key technical decisions.
  5. Scale impact through leadership: mentor and grow senior technical talent.

Skills

Required

  • Bachelor’s degree in Computer Science, Electrical Engineering, or related field (or equivalent experience)
  • Strong C/C++ and Python programming skills
  • Solid software engineering fundamentals
  • Ability to set engineering standards and review architecture at scale
  • Experience with GPU programming and performance optimization (CUDA or equivalent)

Nice to have

  • MS or PhD preferred
  • Track record of building and evaluating deep learning models, coding agents, and developer tooling, and of driving broad adoption across teams or customers.
  • Demonstrated ability to optimize and deploy high-performance models, including on resource-constrained platforms.
  • Deep expertise in GPU performance optimizations, evidenced by benchmark wins or published results.
  • Publications or open-source leadership in deep learning, multi-agent systems, reinforcement learning, or AI systems; contributions to widely used repos or standards.
  • Experience leading projects end-to-end and mentoring small teams; ability to drive concepts to production.
  • Recognized technical leadership (e.g., setting platform direction, creating widely used architectures/APIs, or establishing evaluation/benchmarking standards).

What the JD emphasized

  • 17+ years of industry and/or academic experience in AI systems development
  • Strong exposure to building foundational models, agents or orchestration frameworks
  • Hands-on experience with deep learning frameworks and modern inference stacks
  • Proven track record leading large, cross-team efforts from concept through production, including navigating ambiguity, aligning stakeholders, and delivering measurable outcomes.

Other signals

  • building agentic AI systems
  • multi-agent runtimes and orchestration
  • training and inference stacks
  • GPU-accelerated execution
  • production-grade engineering solutions