Together AI currently has 22 active AI-related job listings. The majority of these roles, 95%, are focused on serving infrastructure, with one role in post-training. The company is primarily hiring for Engineering positions, with 20 roles available, and is seeking candidates in the United States and the Netherlands. Frequent technical tags include model serving, inference infrastructure, and fine-tuning, suggesting a focus on deployment and optimization of AI models. In the last 30 days, Together AI added 5 new AI roles, a 150% increase from the previous 30-day period.
Currently tracking 20 active AI roles, with 14 new openings in the last 4 weeks. Primary focus: Serve · Engineering. Salary range $160k–$300k (avg $226k).
Together AI currently has 24 active AI-related roles in our index. The most common open titles are: Solutions Architect (2); AI Infrastructure Engineer; AI Researcher, Core ML (Turbo); Backend Software Engineer - Data Platform & AI Data Products; Customer Support Engineer (Inference). Most positions are in Engineering and Research.
Together AI's active AI hiring is concentrated in: serving infrastructure (96%), post-training (4%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
Together AI is hiring AI talent in: United States (19 roles), Netherlands (2 roles), United Kingdom (1 role).
Job postings at Together AI most frequently reference: inference infrastructure, model serving, fine-tuning, LLM observability, and audio/speech.
In the past 30 days, Together AI has posted 6 new AI-related roles.
| Title | Description | Stage | AI score |
|---|---|---|---|
| Staff Machine Learning Engineer, Voice AI | Staff ML Engineer focused on optimizing the model serving layer for voice AI applications, including speech-to-text and text-to-speech models, with a focus on latency, throughput, and GPU utilization using inference engines like TRT-LLM and SGLang. The role involves building evaluation frameworks, supporting model partners, and shaping the architecture for next-generation voice models. | Serve | 9 |
| Forward Deployed Engineer (Inference & Post-Training) | Forward Deployed Engineer focused on optimizing inference engines and fine-tuning pipelines for production AI teams, acting as a technical partner to strategic customers. Responsibilities include inference engine optimization, performance tuning, post-training/fine-tuning (LoRA, SFT, DPO, RLHF, GRPO), customer alignment, onboarding, and providing product feedback. | Serve, Post-train | 9 |
| Senior Machine Learning Engineer, Voice AI | Senior ML Engineer focused on optimizing the model serving layer for voice AI workloads, including speech-to-text and text-to-speech models. The role involves hands-on work with inference engines, GPU optimization, batching strategies, and ensuring new model architectures can be productionized efficiently. The goal is to achieve best-in-class latency and reliability for real-time voice applications. | Serve | 9 |
| Systems Research Engineer, GPU Programming | This role focuses on optimizing and developing GPU-accelerated kernels and algorithms for ML/AI applications, requiring expertise in GPU programming (CUDA, Triton) and performance profiling. The engineer will collaborate with modeling, hardware, and software teams to enhance AI system efficiency and co-design GPU architectures. | Serve | 9 |
| AI Researcher, Core ML (Turbo) | AI Researcher focused on the intersection of efficient inference algorithms, architectures, engines, and post-training/RL systems for production-scale API services. The role involves advancing inference efficiency, unifying inference with RL/post-training, and owning critical systems. | Serve, Post-train | 9 |
| Staff Engineer, Distributed Storage and HPC & AI Infrastructure | Staff Engineer focused on designing and delivering multi-petabyte storage systems optimized for AI training and inference workloads. Responsibilities include architecting high-performance parallel filesystems and object stores, building Kubernetes-native storage operators, optimizing data paths for high throughput, and implementing intelligent caching and data distribution strategies. The role requires deep expertise in distributed storage systems, Kubernetes, and programming in Go and Python. | Serve | 8 |
| Machine Learning Engineer | Machine Learning Engineer at Together AI focused on developing and scaling production systems for LLM inference and fine-tuning APIs. Requires strong experience in high-performance, distributed systems and the LLM inference ecosystem. | Serve, Post-train | 8 |
| Machine Learning Engineer - Inference | Machine Learning Engineer focused on optimizing and enhancing the performance of AI inference systems, working with state-of-the-art large language models to ensure efficient and effective operation at scale. Responsibilities include designing and building production systems, optimizing runtime inference services, and creating supporting tools and documentation. | Serve | 8 |
| Staff Platform Engineer, Voice AI | Staff Platform Engineer for Together AI's Voice AI platform, focusing on the architecture and reliability of real-time API layers, autoscaling for latency-sensitive workloads, and building the observability platform for voice infrastructure. The role requires deep expertise in distributed systems, real-time streaming, and Kubernetes, with a strong product intuition for developer platforms. | Serve | 7 |
| AI Infrastructure Engineer | AI Infrastructure Engineer responsible for keeping user-facing services and production systems running smoothly, specializing in systems, availability, reliability, and scalability, with interests in algorithms and distributed systems. Builds and runs infrastructure using Ansible, Terraform, and Kubernetes, and develops monitoring systems. | Serve | 7 |
| Senior Platform Engineer, Voice AI | Senior Platform Engineer for Together AI's Voice AI platform, focusing on the API and infrastructure layer for real-time speech-to-text and text-to-speech models. The role involves building WebSocket and HTTP APIs, designing autoscaling for latency-sensitive streaming, and ensuring platform reliability for production voice agents. | Serve | 7 |
| Senior Backend Engineer, Inference Platform | Senior Backend Engineer focused on building and optimizing the inference platform for advanced generative AI models, including LLMs and multimodal models, at scale. The role involves optimizing latency, throughput, and resource allocation across tens of thousands of GPUs, collaborating with researchers to productionize frontier models, and contributing to open-source inference projects. | Serve | 7 |
| Machine Learning, Platform Engineer | Machine Learning Platform Engineer at Together AI, focusing on building a container platform, optimizing autoscaling, minimizing cold starts, and improving end-to-end model performance for custom models and dedicated inference. The role involves optimizing inference across the stack, including CUDA kernels, PyTorch, inference engines, and container orchestration. | Serve | 7 |
| Senior Software Engineer - Together Cloud Infrastructure | Senior Software Engineer focused on building and operating a high-performance, global AI cloud infrastructure platform. This includes designing and maintaining backend services for hardware management, IaaS software layer for GPU data centers, high-performance object storage for pretraining datasets, and advanced observability stacks for distributed pretraining. The role also involves architecture and research for decentralized AI workloads and contributing to the open-source platform. | Serve, Data | 7 |
| Solutions Architect | Solutions Architect at Together AI to work with customers and prospects to create business value through Generative AI applications. This role involves acting as a technical advisor, running demonstrations and POCs, collaborating with sales, building relationships with customer leadership, delivering feedback to product/engineering/research, and building educational content. Requires 5+ years in a customer-facing technical role with 2+ years in pre-sales, strong technical background in AI/ML/GPU, understanding of LLM training/fine-tuning/inference, Python/JavaScript proficiency, and familiarity with infrastructure services. | Serve | 7 |