AI Researcher, Core ML (turbo)

Together AI · Data AI · San Francisco, CA · Research

AI Researcher focused on the intersection of efficient inference algorithms, architectures, engines, and post-training/RL systems for production-scale API services. The role involves advancing inference efficiency, unifying inference with RL/post-training, and owning critical systems.

What you'd actually do

  1. Advance inference efficiency end‑to‑end
  2. Unify inference with RL / post‑training
  3. Own critical systems at production scale
  4. Provide technical leadership (Staff level)

Skills

Required

  • ML systems
  • large-scale model training
  • inference
  • RL algorithms
  • training engines
  • kernels
  • serving systems
  • SGLang
  • vLLM
  • speculative decoding

Nice to have

  • post-training
  • RLHF
  • reward modeling
  • interpretability
  • asynchronous RL
  • rollout collection
  • scheduling
  • batching

What the JD emphasized

  • production scale
  • frontier models
  • full-stack ownership
  • deep expertise
  • complex technical projects end-to-end

Other signals

  • efficient inference
  • RL-driven training
  • production-scale systems
  • frontier models