Research Engineer, Frontier Speculative Decoding

Together AI · Data AI · San Francisco, CA · Research

A Research Engineer role focused on translating internal model-training research into production-ready deployments by fine-tuning general-purpose models into specialized tools. The work involves designing novel speculator algorithms, curating data, tuning hyperparameters, and evaluating checkpoints, with a focus on accuracy–efficiency tradeoffs for generative AI models.

What you'd actually do

  1. Design and iterate on novel speculator algorithms, combining architectural innovations with carefully curated data to push the frontier of accuracy–efficiency tradeoffs.
  2. Be the critical link between raw data and a production-ready model, seeing your work directly impact our customers' success.
  3. Work in a fast-paced, high-impact role at the cutting edge of generative AI.
  4. Collaborate with a team of experts dedicated to solving real-world, high-performance challenges.
  5. Collaborate directly with customers to understand their needs, and work closely with the core inference and Applied ML research teams to integrate your work into the production platform.

Skills

Required

  • Python
  • PyTorch
  • SLURM and/or Kubernetes clusters
  • modern LLMs and generative models
  • distributed training frameworks (e.g., FSDP, DeepSpeed)
  • Bachelor’s, Master’s, or Ph.D. degree in Computer Science, Computer Engineering, or a related field, or equivalent practical experience

Nice to have

  • understanding customer-specific needs
  • fine-tuning models
  • data curation and processing
  • hyperparameter tuning
  • checkpoint evaluation
  • building on top of existing training codebases
  • navigating complex code and contributing to its improvement
  • submitting and managing jobs in a high-performance computing environment

What the JD emphasized

  • critical link between raw data and a production-ready model
  • meticulous hyperparameter tuning
  • rigorous checkpoint evaluation
  • customer-specific needs
  • highly efficient, specialized models
  • push the frontier of accuracy–efficiency tradeoffs
  • genuine love for data curation and processing
  • meticulous attention to detail
  • effective hyperparameter searches
  • tuning models for specific tasks
  • evaluating model checkpoints to ensure they meet strict quality, performance, and reliability standards

Other signals

  • translating internal model training research to production-ready deployment
  • fine-tuning models on internal data recipes and proprietary data
  • creating highly efficient, specialized models
  • design and iterate on novel speculator algorithms
  • push the frontier of accuracy–efficiency tradeoffs