Together AI
Scaling the AI Frontier · Open-source model infra
- HQ: San Francisco, US
- Website: together.ai
Currently tracking 20 active AI roles, with 14 new openings in the last 4 weeks. Primary focus: Serve · Engineering. Salary range $160k–$300k (avg $226k).
- Hiring: 20 / 20
- Momentum (4w): +0 (0%) · 14 opens last 4w · 14 prior 4w
- Salary range: $160k–$300k (avg $226k) · USD · disclosed roles only
- Tracked since: Jan '24 · last role today
Hiring velocity (weekly chart)
Jobs (20)
| Title | Summary | Stage | AI score |
|---|---|---|---|
| Research Engineer, Core ML | Research Engineer role focused on improving inference efficiency and unifying it with RL/post-training systems for production-grade AI APIs. The role involves end-to-end ownership of critical systems, translating frontier ideas into robust infrastructure, and shipping measurable improvements in latency, throughput, cost, and model quality at scale. | Serve · Post-train | 10 |
| Forward Deployed Engineer (Inference & Post-Training) | Forward Deployed Engineer focused on optimizing inference engines and fine-tuning pipelines for production AI teams, acting as a technical partner to strategic customers. Responsibilities include inference engine optimization, performance tuning, post-training/fine-tuning (LoRA, SFT, DPO, RLHF, GRPO), customer alignment, onboarding, and providing product feedback. | Serve · Post-train | 9 |
| Senior Machine Learning Engineer, Voice AI | Senior ML Engineer focused on optimizing the model serving layer for voice AI workloads, including speech-to-text and text-to-speech models. The role involves hands-on work with inference engines, GPU optimization, batching strategies, and ensuring new model architectures can be productionized efficiently. The goal is to achieve best-in-class latency and reliability for real-time voice applications. | Serve | 9 |
| Research Engineer, Frontier Speculative Decoding | Research Engineer focused on translating internal model training research into production-ready deployments by fine-tuning general-purpose models into specialized tools. This involves designing novel speculative decoding algorithms, data curation, hyperparameter tuning, and checkpoint evaluation, with a focus on accuracy-efficiency tradeoffs for generative AI models. | Post-train · Serve | 9 |
| Systems Research Engineer, GPU Programming | This role focuses on optimizing and developing GPU-accelerated kernels and algorithms for ML/AI applications, requiring expertise in GPU programming (CUDA, Triton) and performance profiling. The engineer will collaborate with modeling, hardware, and software teams to enhance AI system efficiency and co-design GPU architectures. | Serve | 9 |
| AI Researcher, Core ML (Turbo) | AI Researcher focused on the intersection of efficient inference algorithms, architectures, engines, and post-training/RL systems for production-scale API services. The role involves advancing inference efficiency, unifying inference with RL/post-training, and owning critical systems. | Serve · Post-train | 9 |
| Forward Deployed Engineer (GPU Clusters) | The Forward Deployed Engineer (FDE) will be a technical partner to customers building large-scale AI models, focusing on GPU cluster infrastructure, networking, storage, and orchestration to ensure stability, optimize performance, and facilitate platform adoption. This role involves hardening clusters, tuning orchestration layers (Kubernetes/SLURM), debugging low-level bottlenecks, building reference designs, and leading benchmarking exercises. | Serve | 8 |
| Engineering Manager, Model Serving | Engineering Manager for Together AI's Model Serving platform, focusing on delivering world-class inference and fine-tuning in public APIs and customer deployments. Responsibilities include owning SLAs, improving testing/deployment/monitoring, building self-serve tooling, defining configuration best practices for inference engines, leading incident response, and mentoring team members. Requires 5+ years operating production ML inference or training systems at scale and 2+ years in senior IC or tech lead roles, with deep expertise in Kubernetes, multi-cluster orchestration, and ML serving frameworks. | Serve · Post-train | 8 |
| LLM Inference Frameworks and Optimization Engineer | Inference Frameworks and Optimization Engineer to design, develop, and optimize distributed inference engines for multimodal and language models. Focus on low-latency, high-throughput inference, GPU/accelerator optimizations, and software-hardware co-design for efficient large-scale AI deployment. | Serve | 8 |
| Machine Learning Engineer | Machine Learning Engineer at Together AI focused on developing and scaling production systems for LLM inference and fine-tuning APIs. Requires strong experience in high-performance, distributed systems and the LLM inference ecosystem. | Serve · Post-train | 8 |
| Machine Learning Engineer - Inference | Machine Learning Engineer focused on optimizing and enhancing the performance of AI inference systems, working with state-of-the-art large language models to ensure efficient and effective operation at scale. Responsibilities include designing and building production systems, optimizing runtime inference services, and creating supporting tools and documentation. | Serve | 8 |
| Senior Platform Engineer, Voice AI | Senior Platform Engineer for Together AI's Voice AI platform, focusing on the API and infrastructure layer for real-time speech-to-text and text-to-speech models. The role involves building WebSocket and HTTP APIs, designing autoscaling for latency-sensitive streaming, and ensuring platform reliability for production voice agents. | Serve | 7 |
| Backend Engineer | Senior Backend/Distributed Systems Engineer to build and maintain the Together AI Sandbox service, focusing on API platform performance, reliability, and scalability. Responsibilities include designing core backend components, performing research for AI workloads, and ensuring code quality through design and code reviews. | Serve | 7 |
| Together Cloud Infrastructure Engineer | This role focuses on building and maintaining the AI cloud infrastructure, including services for hardware management, IaaS software layer for GPU data centers, high-performance object storage for pretraining, and advanced observability stacks. The engineer will work on the core Together AI platform, create services and tools, and develop testing frameworks for robustness and fault-tolerance. | Serve · Data | 7 |
| Staff Engineer, Distributed Storage, HPC & AI Infrastructure | Staff Engineer focused on designing and delivering multi-petabyte distributed storage systems optimized for AI training and inference workloads. Responsibilities include architecting high-performance parallel filesystems and object stores, integrating cutting-edge technologies, driving cost optimization, and building Kubernetes-native storage operators and self-service platforms. The role requires deep expertise in distributed storage, Kubernetes, and performance optimization for GPU/HPC clusters, with strong coding skills in Go and Python. | Serve | 7 |
| Senior Backend Engineer, Inference Platform | Senior Backend Engineer focused on building and optimizing the inference platform for advanced generative AI models, including LLMs and multimodal models, at scale. The role involves optimizing latency, throughput, and resource allocation across tens of thousands of GPUs, collaborating with researchers to productionize frontier models, and contributing to open-source inference projects. | Serve | 7 |
| Machine Learning Platform Engineer | Machine Learning Platform Engineer at Together AI, focusing on building a container platform, optimizing autoscaling, minimizing cold starts, and improving end-to-end model performance for custom models and dedicated inference. The role involves optimizing inference across the stack, including CUDA kernels, PyTorch, inference engines, and container orchestration. | Serve | 7 |
| AI Infrastructure Engineer | AI Infrastructure Engineer responsible for keeping user-facing services and production systems running smoothly, applying engineering principles and automation to operating environments. Focuses on systems, availability, reliability, and scalability, with interests in algorithms and distributed systems. Builds and runs infrastructure using Ansible, Terraform, and Kubernetes, and designs monitoring systems. | Serve | 7 |
| Senior Software Engineer - Together Cloud Infrastructure | Senior Software Engineer focused on building and operating a high-performance, global AI cloud infrastructure platform. This includes designing and maintaining backend services for hardware management, IaaS software layer for GPU data centers, high-performance object storage for pretraining datasets, and advanced observability stacks for distributed pretraining. The role also involves architecture and research for decentralized AI workloads and contributing to the open-source platform. | Serve · Data | 7 |
| Solutions Architect | Solutions Architect at Together AI to work with customers and prospects to create business value through Generative AI applications. This role involves acting as a technical advisor, running demonstrations and POCs, collaborating with sales, building relationships with customer leadership, delivering feedback to product/engineering/research, and building educational content. Requires 5+ years in a customer-facing technical role with 2+ years in pre-sales, strong technical background in AI/ML/GPU, understanding of LLM training/fine-tuning/inference, Python/JavaScript proficiency, and familiarity with infrastructure services. | Serve | 7 |