Research Intern, Inference (Fall 2026)

Together AI · Data AI · San Francisco, CA · Research

Research internship focused on building efficient, scalable, and reliable serving systems for large foundation models. The work spans distributed inference, compiler optimization, and hardware optimization.

What you'd actually do

  1. Design and conduct rigorous experiments to validate hypotheses
  2. Communicate the plans, progress, and results of projects to the broader team
  3. Document findings in scientific publications and blog posts

Skills

Required

  • Machine Learning fundamentals
  • Deep Learning fundamentals
  • PyTorch
  • JAX
  • Python
  • Transformer architectures
  • Foundation models

Nice to have

  • CUDA programming
  • model optimization techniques
  • hardware acceleration approaches
  • open-source machine learning projects

What the JD emphasized

  • foundation models
  • efficient machine learning
  • ML systems
  • MLSys
  • ICLR

Other signals

  • research
  • inference
  • serving systems
  • foundation models
  • optimization