AI Hire Signal

Tracking AI hiring across 200+ US tech companies. Stage, salary, and stack signals on every role — refreshed weekly.

© 2026 AI Hire Signal · Not affiliated with companies shown
Together AI

Data · AI · Open-source model infra

HQ: San Francisco, US · Website: together.ai

Currently tracking 20 active AI roles, with 14 new openings in the last 4 weeks. Primary focus: Serve · Engineering. Salary range $160k–$300k (avg $226k).

Hiring: 20 / 20 roles active
Momentum (4w): 0% (14 openings in the last 4 weeks vs 14 in the prior 4 weeks)
Salary range: $160k–$300k, avg $226k (USD, disclosed roles only)
Tracked since: Jan '24 (last role posted yesterday)
Hiring velocity (new roles per week; weeks with no openings omitted)

2024: Jan 15 (2) · Jun 3 (1)
2025: Jan 13 (2) · Jan 20 (1) · Jan 27 (1) · Feb 24 (1) · Mar 24 (1) · Apr 28 (1) · May 12 (1) · Jun 2 (3) · Jun 23 (1) · Aug 18 (2) · Aug 25 (1) · Oct 27 (1) · Nov 3 (1) · Nov 17 (1)
2026: Jan 5 (1) · Jan 19 (3) · Feb 16 (1) · Feb 23 (3) · Mar 2 (1) · Mar 9 (7) · Mar 30 (3) · Apr 6 (7) · Apr 13 (1) · Apr 27 (4) · May 4 (2)
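The 4-week momentum figure (14 openings in the last four weeks vs 14 in the prior four, hence 0%) can be reproduced as a simple rolling comparison. A minimal sketch — the weekly counts are the most recent eight data points from the chart, the `momentum` function name is illustrative, and treating only weeks with openings as window members is an assumption about how the site bins its windows:

```python
# Weekly new-role counts, most recent eight data points from the chart.
weekly_counts = {
    "Feb 23": 3, "Mar 2": 1, "Mar 9": 7, "Mar 30": 3,   # prior 4 weeks with openings
    "Apr 6": 7, "Apr 13": 1, "Apr 27": 4, "May 4": 2,   # last 4 weeks with openings
}

def momentum(last_4w: int, prior_4w: int) -> float:
    """Percent change in openings: last 4 weeks vs the prior 4 (illustrative)."""
    if prior_4w == 0:
        return float("inf") if last_4w else 0.0
    return 100.0 * (last_4w - prior_4w) / prior_4w

counts = list(weekly_counts.values())          # insertion order is chronological
prior, last = sum(counts[:4]), sum(counts[4:])
print(last, prior, momentum(last, prior))      # 14 14 0.0
```

With equal 4-week sums the percent change is zero, matching the 0% momentum shown above.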

Jobs (16)

20 AI roles · 53 total active
Filters: Country = United States · Active only · AI score ≥ 7
Stage: Serve (19) · Post-train (1)
Function: Engineering (18) · Research (2)
Country: United States (16) · Netherlands (3)
Sorted by AI score. Each listing shows stage · function · location · first seen · AI score.
Research Engineer, Core ML
Research Engineer role focused on improving inference efficiency and unifying it with RL/post-training systems for production-grade AI APIs. The role involves end-to-end ownership of critical systems, translating frontier ideas into robust infrastructure, and shipping measurable improvements in latency, throughput, cost, and model quality at scale.
Serve · Post-train · Research · San Francisco, CA · First seen Feb 18 · AI score 10
Forward Deployed Engineer (Inference & Post-Training)
Forward Deployed Engineer focused on optimizing inference engines and fine-tuning pipelines for production AI teams, acting as a technical partner to strategic customers. Responsibilities include inference engine optimization, performance tuning, post-training/fine-tuning (LoRA, SFT, DPO, RLHF, GRPO), customer alignment, onboarding, and providing product feedback.
Serve · Post-train · Engineering · San Francisco, CA · First seen 6d ago · AI score 9
Senior Machine Learning Engineer, Voice AI
Senior ML Engineer focused on optimizing the model serving layer for voice AI workloads, including speech-to-text and text-to-speech models. The role involves hands-on work with inference engines, GPU optimization, batching strategies, and ensuring new model architectures can be productionized efficiently. The goal is to achieve best-in-class latency and reliability for real-time voice applications.
Serve · Engineering · San Francisco, CA · First seen 6w ago · AI score 9
Research Engineer, Frontier Speculative Decoding
Research Engineer focused on translating internal model training research into production-ready deployments by fine-tuning general-purpose models into specialized tools. This involves designing novel speculative algorithms, data curation, hyperparameter tuning, and checkpoint evaluation, with a focus on accuracy-efficiency tradeoffs for generative AI models.
Post-train · Serve · Research · San Francisco, CA · First seen Nov '25 · AI score 9
Systems Research Engineer, GPU Programming
This role focuses on optimizing and developing GPU-accelerated kernels and algorithms for ML/AI applications, requiring expertise in GPU programming (CUDA, Triton) and performance profiling. The engineer will collaborate with modeling, hardware, and software teams to enhance AI system efficiency and co-design GPU architectures.
Serve · Engineering · San Francisco, CA · First seen Jan '24 · AI score 9
AI Researcher, Core ML (Turbo)
AI Researcher focused on the intersection of efficient inference algorithms, architectures, engines, and post-training/RL systems for production-scale API services. The role involves advancing inference efficiency, unifying inference with RL/post-training, and owning critical systems.
Serve · Post-train · Engineering · San Francisco, CA · First seen Jan '24 · AI score 9
Forward Deployed Engineer (GPU Clusters)
The Forward Deployed Engineer (FDE) will be a technical partner to customers building large-scale AI models, focusing on GPU cluster infrastructure, networking, storage, and orchestration to ensure stability, optimize performance, and facilitate platform adoption. This role involves hardening clusters, tuning orchestration layers (Kubernetes/SLURM), debugging low-level bottlenecks, building reference designs, and leading benchmarking exercises.
Serve · Engineering · San Francisco, CA · First seen 2w ago · AI score 8
Engineering Manager, Model Serving
Engineering Manager for Together AI's Model Serving platform, focusing on delivering world-class inference and fine-tuning in public APIs and customer deployments. Responsibilities include owning SLAs, improving testing/deployment/monitoring, building self-serve tooling, defining configuration best practices for inference engines, leading incident response, and mentoring team members. Requires 5+ years operating production ML inference or training systems at scale and 2+ years in senior IC or tech lead roles, with deep expertise in Kubernetes, multi-cluster orchestration, and ML serving frameworks.
Serve · Post-train · Engineering · San Francisco, CA · First seen Mar 5 · AI score 8
Machine Learning Engineer
Machine Learning Engineer at Together AI focused on developing and scaling production systems for LLM inference and fine-tuning APIs. Requires strong experience in high-performance, distributed systems and the LLM inference ecosystem.
Serve · Post-train · Engineering · San Francisco, CA · First seen Jan '25 · AI score 8
Machine Learning Engineer - Inference
Machine Learning Engineer focused on optimizing and enhancing the performance of AI inference systems, working with state-of-the-art large language models to ensure efficient and effective operation at scale. Responsibilities include designing and building production systems, optimizing runtime inference services, and creating supporting tools and documentation.
Serve · Engineering · San Francisco, CA · First seen Jun '24 · AI score 8
Senior Platform Engineer, Voice AI
Senior Platform Engineer for Together AI's Voice AI platform, focusing on the API and infrastructure layer for real-time speech-to-text and text-to-speech models. The role involves building WebSocket and HTTP APIs, designing autoscaling for latency-sensitive streaming, and ensuring platform reliability for production voice agents.
Serve · Engineering · San Francisco, CA · First seen 6w ago · AI score 7
Senior Backend Engineer, Inference Platform
Senior Backend Engineer focused on building and optimizing the inference platform for advanced generative AI models, including LLMs and multimodal models, at scale. The role involves optimizing latency, throughput, and resource allocation across tens of thousands of GPUs, collaborating with researchers to productionize frontier models, and contributing to open-source inference projects.
Serve · Engineering · San Francisco, CA · First seen Aug '25 · AI score 7
Machine Learning, Platform Engineer
Machine Learning Platform Engineer at Together AI, focusing on building a container platform, optimizing autoscaling, minimizing cold starts, and improving end-to-end model performance for custom models and dedicated inference. The role involves optimizing inference across the stack, including CUDA kernels, PyTorch, inference engines, and container orchestration.
Serve · Engineering · San Francisco, CA · First seen Aug '25 · AI score 7
AI Infrastructure Engineer
AI Infrastructure Engineer responsible for keeping user-facing services and production systems running smoothly, applying engineering principles and automation to operating environments. Focuses on systems, availability, reliability, and scalability, with interests in algorithms and distributed systems. Builds and runs infrastructure using Ansible, Terraform, and Kubernetes, and designs monitoring systems.
Serve · Engineering · San Francisco, CA · First seen Jun '25 · AI score 7
Senior Software Engineer - Together Cloud Infrastructure
Senior Software Engineer focused on building and operating a high-performance, global AI cloud infrastructure platform. This includes designing and maintaining backend services for hardware management, IaaS software layer for GPU data centers, high-performance object storage for pretraining datasets, and advanced observability stacks for distributed pretraining. The role also involves architecture and research for decentralized AI workloads and contributing to the open-source platform.
Serve · Data · Engineering · San Francisco, CA · First seen Jun '25 · AI score 7
Solutions Architect
Solutions Architect at Together AI to work with customers and prospects to create business value through Generative AI applications. This role involves acting as a technical advisor, running demonstrations and POCs, collaborating with sales, building relationships with customer leadership, delivering feedback to product/engineering/research, and building educational content. Requires 5+ years in a customer-facing technical role with 2+ years in pre-sales, strong technical background in AI/ML/GPU, understanding of LLM training/fine-tuning/inference, Python/JavaScript proficiency, and familiarity with infrastructure services.
Serve · Engineering · San Francisco, CA · First seen Jan '25 · AI score 7
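The listing's filter and sort controls (active only, AI score ≥ 7, country match, ordered by AI score) amount to a predicate plus a sort key. A minimal sketch — the field names, `filter_jobs` helper, and sample rows are illustrative, not the site's actual data model:

```python
from typing import TypedDict

class Job(TypedDict):
    title: str
    country: str
    active: bool
    ai_score: int

def filter_jobs(jobs: list[Job], country: str = "United States",
                min_ai_score: int = 7) -> list[Job]:
    """Keep active roles in the given country with AI score >= threshold,
    highest-scored first (mirrors the page's default view)."""
    kept = [j for j in jobs
            if j["active"] and j["country"] == country
            and j["ai_score"] >= min_ai_score]
    return sorted(kept, key=lambda j: j["ai_score"], reverse=True)

# Hypothetical sample rows based on titles from the listing above.
jobs: list[Job] = [
    {"title": "Solutions Architect", "country": "United States",
     "active": True, "ai_score": 7},
    {"title": "Research Engineer, Core ML", "country": "United States",
     "active": True, "ai_score": 10},
    {"title": "ML Engineer", "country": "Netherlands",
     "active": True, "ai_score": 8},
]
print([j["title"] for j in filter_jobs(jobs)])
# ['Research Engineer, Core ML', 'Solutions Architect']
```

The Netherlands row is dropped by the country filter, and the remaining roles come back in descending AI-score order, matching how the table above is sorted.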