Machine Learning Engineer

Together AI · Data AI · San Francisco, CA · Engineering

Machine Learning Engineer at Together AI focused on developing and scaling production systems for LLM inference and fine-tuning APIs. Requires strong experience in high-performance, distributed systems and the LLM inference ecosystem.

What you'd actually do

  1. Design and build the production systems that power the Together Cloud inference and fine-tuning APIs, enabling reliability and performance at scale
  2. Partner with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world
  3. Analyze and improve efficiency, scalability, and stability of various system resources
  4. Conduct design and code reviews
  5. Create services, tools & developer documentation

Skills

Required

  • Python
  • Go
  • Rust
  • C/C++
  • LLM inference ecosystem
  • distributed systems
  • runtime inference services

Nice to have

  • vLLM
  • SGLang
  • TensorRT (TRT)

What the JD emphasized

  • inference and fine-tuning of LLMs
  • implementing runtime systems that perform inference at scale
  • LLM inference ecosystem
  • implementing runtime inference services at scale

Other signals

  • LLM inference
  • fine-tuning LLMs
  • runtime systems
  • production quality code