Systems Performance Engineer, Agentic AI Workloads – New College Grad 2026

NVIDIA · Semiconductors · Santa Clara, CA +2

This role focuses on modeling, simulating, and analyzing the system-level performance of agentic AI workloads in datacenter environments. The engineer will develop simulators, characterize LLM serving traffic, identify performance bottlenecks, and provide architectural recommendations for next-generation AI systems. The role requires strong programming skills in C++ and Python; a solid understanding of queueing theory, traffic modeling, and statistics; and a grasp of the fundamentals of deep learning and LLM inference serving.

What you'd actually do

  1. Develop and extend C++ and Python simulators that model system-level network and compute traffic for agentic LLM workloads in datacenter environments
  2. Characterize real-world LLM serving workloads and distill them into representative simulator inputs
  3. Run simulations at scale and apply statistical techniques to post-process and interpret results
  4. Identify performance bottlenecks and translate findings into concrete architectural recommendations
  5. Collaborate with hardware, software, and research teams to influence the design of future AI systems

Skills

Required

  • C++
  • Python
  • queueing theory
  • traffic modeling
  • statistics
  • deep learning fundamentals
  • LLMs
  • inference serving frameworks

Nice to have

  • traffic or network simulators
  • roofline modeling
  • performance scaling of deep learning models
  • large-scale simulation campaigns
  • data pipelines
  • benchmarking ML inference workloads

What the JD emphasized

  • agentic LLM workloads
  • model, simulate, and reason about complex system-level traffic at scale
  • performance analysis
  • characterize huge datasets
  • LLM serving workloads

Other signals

  • modeling and simulation of agentic LLM workloads
  • performance analysis of AI infrastructure
  • characterizing LLM serving workloads