Performance Engineer Intern, Deep Learning and HPC - 2026

NVIDIA · Semiconductors · Shanghai, China

NVIDIA is seeking a Performance Engineer Intern to support performance testing of datacenter products and applications, focusing on AI workloads like LLM training and inference, as well as HPC. The role involves benchmarking, profiling, analyzing performance, developing automation scripts, and collaborating with internal teams. The intern will aggregate and report testing data for sales, marketing, and engineering teams, and assist in developing tools and processes for automated testing.

What you'd actually do

  1. Benchmark, profile, and analyze the performance of AI workloads for large-scale LLM training and inference, as well as High-Performance Computing (HPC) workloads, on NVIDIA supercomputers and distributed systems.
  2. Aggregate testing data and produce written reports for internal sales, marketing, SW, and HW teams.
  3. Develop Python scripts to automate the testing of various applications.
  4. Collaborate with internal teams to debug and resolve performance issues.
  5. Assist with the development of tools and processes that improve our ability to perform automated testing.

Skills

Required

  • Programming and debugging with scripting languages such as Python or Unix shell
  • Strong data analysis skills
  • Ability to summarize findings in a written report
  • Hands-on experience with Linux-based systems
  • Familiarity with a container platform such as Docker or Singularity
  • Experience with compiling and running software from source code

Nice to have

  • CI/CD pipelines and modern DevOps practices
  • Cloud provisioning and scheduling tools (Kubernetes, SLURM)
  • Curiosity about GPUs, TPUs, cloud and performance benchmarking
  • Familiarity with ML/DL techniques, algorithms, and frameworks such as TensorFlow or PyTorch
  • Experience deploying AI models for inference and launching training jobs
  • Background in system-level problem solving

What the JD emphasized

  • large-scale LLM training and inference

Other signals

  • performance testing
  • LLM training and inference
  • automation
  • benchmarking