AI Performance Engineer Intern

Intel · Semiconductors · Shanghai, China

This AI Performance Engineer Intern role at Intel focuses on analyzing silicon chip performance for deep learning, conducting large-scale benchmarks, designing automation tools for data collection and analysis, and researching new architectural features for GPUs, CPUs, and SoCs. The role involves system-level modeling, testing, characterization, and performance-per-watt analysis, with a strong emphasis on understanding deep learning models and frameworks.

What you'd actually do

  1. Perform large-scale benchmarks for competitive and comparative analysis.
  2. Design tools to automate product definitions, data collection, test case execution, and results analysis. Provide detailed data analysis of functionality, performance, and latency.
  3. Research new architectural features for next-generation GPUs, CPUs, and SoCs to keep pace with the latest silicon designs.
  4. Tackle multi-variable problems via system-level modeling, testing and characterization, trend analysis/projection, and model verification. Bring depth and expertise in performance-per-watt analysis.

Skills

Required

  • BS/MS or equivalent experience in relevant discipline (CE, CS&E, CS, AI)
  • Excellent C/C++/Python programming and software design skills
  • Performance modelling, profiling, debugging, and code optimization
  • Knowledge of deep learning models
  • Experience with popular AI frameworks such as PyTorch, TensorFlow, or TensorRT
  • Ability to learn quickly and independently

Nice to have

  • Architectural knowledge of CPU, GPU or SoC
  • Familiarity with distributed training and inference
  • Knowledge of FLOPS and memory-bandwidth testing, mixed precision, operator fusion, and graph optimization
  • Expertise in LLMs/AIGC, such as LLaMA, DeepSeek, Kimi, or Qwen

What the JD emphasized

  • performance optimization
  • performance analysis
  • performance-per-watt analysis
  • performance modelling
  • performance

Other signals

  • benchmarking
  • performance analysis
  • optimization