Systems Research Engineer Intern - GPU Programming (Fall 2026)

Together AI · Data AI · San Francisco, CA · Research

Internship role focused on optimizing GPU-accelerated kernels and algorithms for ML/AI applications, co-designing GPU kernels and model architecture with modeling teams, and contributing to efficient GPU architectures and programming models.

What you'd actually do

  1. Optimize and fine-tune GPU code to achieve better performance and scalability
  2. Collaborate with cross-functional teams to integrate GPU-accelerated solutions into existing software systems
  3. Stay up-to-date with the latest advancements in GPU programming techniques and technologies

Skills

Required

  • GPU programming
  • parallel computing
  • CUDA
  • Triton
  • ML/AI applications
  • performance profiling
  • optimization tools

Nice to have

  • co-design GPU kernels and model architecture
  • co-design efficient GPU architectures and programming models

What the JD emphasized

  • GPU programming
  • parallel computing
  • CUDA
  • Triton
  • performance profiling
  • optimization tools

Other signals

  • GPU programming
  • ML/AI applications
  • performance optimization