AI Hire Signal

Tracking AI hiring across 200+ US tech companies. Stage, salary, and stack signals on every role — refreshed weekly.

© 2026 AI Hire Signal · Not affiliated with companies shown

Currently tracking 20 active AI roles, with 14 new openings in the last 4 weeks. Primary focus: Serve · Engineering. Salary range $160k–$300k (avg $226k).

Hiring: 20 / 20 active
Momentum (4w): 0% · 14 openings in the last 4 weeks vs 14 in the prior 4 weeks
Salary range: $160k–$300k · avg $226k (USD, disclosed roles only)
Tracked since: Jan '24 · last role posted yesterday
Hiring velocity (new roles per week): Jan 15: 2 · Jun 3: 1 · Jan 13: 2 · Jan 20: 1 · Jan 27: 1 · Feb 24: 1 · Mar 24: 1 · Apr 28: 1 · May 12: 1 · Jun 2: 3 · Jun 23: 1 · Aug 18: 2 · Aug 25: 1 · Oct 27: 1 · Nov 3: 1 · Nov 17: 1 · Jan 5: 1 · Jan 19: 3 · Feb 16: 1 · Feb 23: 3 · Mar 2: 1 · Mar 9: 7 · Mar 30: 3 · Apr 6: 7 · Apr 13: 1 · Apr 27: 4 · May 4: 2

Jobs (43)

20 AI · 53 total active · filtered to Function: Engineering · sorted by AI score
Stage: Post-train 1 · Serve 21
Function: Engineering 43 · Product 9 · Research 2
Country: United States 42 · Netherlands 6

Title · Stage · Function · Location · First seen · AI score
Forward Deployed Engineer (Inference & Post-Training)
Forward Deployed Engineer focused on optimizing inference engines and fine-tuning pipelines for production AI teams, acting as a technical partner to strategic customers. Responsibilities include inference engine optimization, performance tuning, post-training/fine-tuning (LoRA, SFT, DPO, RLHF, GRPO), customer alignment, onboarding, and providing product feedback.
Serve · Post-train · Engineering · San Francisco, CA · 6d ago · AI score 9
Senior Machine Learning Engineer, Voice AI
Senior ML Engineer focused on optimizing the model serving layer for voice AI workloads, including speech-to-text and text-to-speech models. The role involves hands-on work with inference engines, GPU optimization, batching strategies, and ensuring new model architectures can be productionized efficiently. The goal is to achieve best-in-class latency and reliability for real-time voice applications.
Serve · Engineering · San Francisco, CA · 6w ago · AI score 9
Systems Research Engineer, GPU Programming
This role focuses on optimizing and developing GPU-accelerated kernels and algorithms for ML/AI applications, requiring expertise in GPU programming (CUDA, Triton) and performance profiling. The engineer will collaborate with modeling, hardware, and software teams to enhance AI system efficiency and co-design GPU architectures.
Serve · Engineering · San Francisco, CA · Jan '24 · AI score 9
AI Researcher, Core ML (Turbo)
AI Researcher focused on the intersection of efficient inference algorithms, architectures, engines, and post-training/RL systems for production-scale API services. The role involves advancing inference efficiency, unifying inference with RL/post-training, and owning critical systems.
Serve · Post-train · Engineering · San Francisco, CA · Jan '24 · AI score 9
Forward Deployed Engineer (GPU Clusters)
The Forward Deployed Engineer (FDE) will be a technical partner to customers building large-scale AI models, focusing on GPU cluster infrastructure, networking, storage, and orchestration to ensure stability, optimize performance, and facilitate platform adoption. This role involves hardening clusters, tuning orchestration layers (Kubernetes/SLURM), debugging low-level bottlenecks, building reference designs, and leading benchmarking exercises.
Serve · Engineering · San Francisco, CA · 2w ago · AI score 8
Engineering Manager, Model Serving
Engineering Manager for Together AI's Model Serving platform, focusing on delivering world-class inference and fine-tuning in public APIs and customer deployments. Responsibilities include owning SLAs, improving testing/deployment/monitoring, building self-serve tooling, defining configuration best practices for inference engines, leading incident response, and mentoring team members. Requires 5+ years operating production ML inference or training systems at scale and 2+ years in senior IC or tech lead roles, with deep expertise in Kubernetes, multi-cluster orchestration, and ML serving frameworks.
Serve · Post-train · Engineering · San Francisco, CA · Mar 5 · AI score 8
LLM Inference Frameworks and Optimization Engineer
Seeking an Inference Frameworks and Optimization Engineer to design, develop, and optimize distributed inference engines for multimodal and language models. Focus on low-latency, high-throughput inference, GPU/accelerator optimizations, and software-hardware co-design for efficient large-scale AI deployment.
Serve · Engineering · Remote · Mar '25 · AI score 8
Machine Learning Engineer
Machine Learning Engineer at Together AI focused on developing and scaling production systems for LLM inference and fine-tuning APIs. Requires strong experience in high-performance, distributed systems and the LLM inference ecosystem.
Serve · Post-train · Engineering · San Francisco, CA · Jan '25 · AI score 8
Machine Learning Engineer - Inference
Machine Learning Engineer focused on optimizing and enhancing the performance of AI inference systems, working with state-of-the-art large language models to ensure efficient and effective operation at scale. Responsibilities include designing and building production systems, optimizing runtime inference services, and creating supporting tools and documentation.
Serve · Engineering · San Francisco, CA · Jun '24 · AI score 8
Senior Platform Engineer, Voice AI
Senior Platform Engineer for Together AI's Voice AI platform, focusing on the API and infrastructure layer for real-time speech-to-text and text-to-speech models. The role involves building WebSocket and HTTP APIs, designing autoscaling for latency-sensitive streaming, and ensuring platform reliability for production voice agents.
Serve · Engineering · San Francisco, CA · 6w ago · AI score 7
Backend Engineer
Senior Backend/Distributed Systems Engineer to build and maintain the Together AI Sandbox service, focusing on API platform performance, reliability, and scalability. Responsibilities include designing core backend components, performing research for AI workloads, and ensuring code quality through design and code reviews.
Serve · Engineering · Amsterdam, Netherlands · Mar 10 · AI score 7
Together Cloud Infrastructure Engineer
This role focuses on building and maintaining the AI cloud infrastructure, including services for hardware management, IaaS software layer for GPU data centers, high-performance object storage for pretraining, and advanced observability stacks. The engineer will work on the core Together AI platform, create services and tools, and develop testing frameworks for robustness and fault-tolerance.
Serve · Data · Engineering · Amsterdam, Netherlands · Jan 20 · AI score 7
Staff Engineer, Distributed Storage, HPC & AI Infrastructure
Staff Engineer focused on designing and delivering multi-petabyte distributed storage systems optimized for AI training and inference workloads. Responsibilities include architecting high-performance parallel filesystems and object stores, integrating cutting-edge technologies, driving cost optimization, and building Kubernetes-native storage operators and self-service platforms. The role requires deep expertise in distributed storage, Kubernetes, and performance optimization for GPU/HPC clusters, with strong coding skills in Go and Python.
Serve · Engineering · Amsterdam, Netherlands · Jan 20 · AI score 7
Senior Backend Engineer, Inference Platform
Senior Backend Engineer focused on building and optimizing the inference platform for advanced generative AI models, including LLMs and multimodal models, at scale. The role involves optimizing latency, throughput, and resource allocation across tens of thousands of GPUs, collaborating with researchers to productionize frontier models, and contributing to open-source inference projects.
Serve · Engineering · San Francisco, CA · Aug '25 · AI score 7
Machine Learning Platform Engineer
Machine Learning Platform Engineer at Together AI, focusing on building a container platform, optimizing autoscaling, minimizing cold starts, and improving end-to-end model performance for custom models and dedicated inference. The role involves optimizing inference across the stack, including CUDA kernels, PyTorch, inference engines, and container orchestration.
Serve · Engineering · San Francisco, CA · Aug '25 · AI score 7
AI Infrastructure Engineer
AI Infrastructure Engineer responsible for keeping user-facing services and production systems running smoothly, applying engineering principles and automation to operating environments. Focuses on systems, availability, reliability, and scalability, with interests in algorithms and distributed systems. Builds and runs infrastructure using Ansible, Terraform, and Kubernetes, and designs monitoring systems.
Serve · Engineering · San Francisco, CA · Jun '25 · AI score 7
Senior Software Engineer - Together Cloud Infrastructure
Senior Software Engineer focused on building and operating a high-performance, global AI cloud infrastructure platform. This includes designing and maintaining backend services for hardware management, IaaS software layer for GPU data centers, high-performance object storage for pretraining datasets, and advanced observability stacks for distributed pretraining. The role also involves architecture and research for decentralized AI workloads and contributing to the open-source platform.
Serve · Data · Engineering · San Francisco, CA · Jun '25 · AI score 7
Solutions Architect
Solutions Architect at Together AI to work with customers and prospects to create business value through Generative AI applications. This role involves acting as a technical advisor, running demonstrations and POCs, collaborating with sales, building relationships with customer leadership, delivering feedback to product/engineering/research, and building educational content. Requires 5+ years in a customer-facing technical role with 2+ years in pre-sales, strong technical background in AI/ML/GPU, understanding of LLM training/fine-tuning/inference, Python/JavaScript proficiency, and familiarity with infrastructure services.
Serve · Engineering · San Francisco, CA · Jan '25 · AI score 7
Staff Engineer, Customer Insights
Staff Engineer to build and scale the customer-facing visibility layer for Together's AI Cloud, focusing on historical analytics, activity history, audit logs, event timelines, notifications, and investigation workflows. The role will evolve these foundations into AI-first investigation and insight workflows that summarize activity, explain anomalies, and provide trustworthy context for human operators and autonomous agents. This is a hands-on role designing event, query, delivery, and governance systems, and building user-facing workflows for enterprise customers.
— · Engineering · San Francisco, CA · 1w ago · AI score 5
Technical Account Manager (TAM), AI Factory
This role is a Technical Account Manager focused on the infrastructure supporting large-scale AI GPU deployments for a strategic enterprise customer. The TAM will be the primary technical point of contact, responsible for the end-to-end technical relationship across compute, networking, storage, and facilities, ensuring smooth delivery and operational health. Responsibilities include issue lifecycle management, hardware lifecycle management, advising on infrastructure stack best practices, owning the observability strategy, coordinating operations, and managing capacity expansions. The role requires deep expertise in GPU infrastructure, large-scale networking, enterprise storage, and DC operations, with experience in customer-facing technical roles and AI/HPC infrastructure.
— · Engineering · San Francisco, CA · 2w ago · AI score 5
Director, Support Engineering
This role leads and scales the customer support function for Together AI, focusing on both API support (serverless/dedicated inference, billing) and GPU support (large-scale training infrastructure). It's a player-coach position requiring hands-on involvement in complex escalations, managing support engineers, defining KPIs, and improving support workflows and tooling. The role requires strong technical depth in AI infrastructure, distributed systems, and experience with SLA-driven operations.
— · Engineering · San Francisco, CA · 2w ago · AI score 5
Customer Support Engineer (GPU Cluster)
Customer Support Engineer role focused on supporting customers using Together AI's GPU clusters for training, fine-tuning, and inference. The role involves resolving complex technical challenges, acting as a product expert, and collaborating with Engineering and Product teams. Requires experience in customer-facing technical roles, familiarity with AI/ML, GPU technologies, and infrastructure services like Kubernetes.
— · Engineering · San Francisco, CA · 5w ago · AI score 5
Backend Software Engineer — Data Platform & AI Data Products
Backend Software Engineer focused on building data platform infrastructure and LLM-adjacent data products. The role involves designing and developing backend services for event streams, access layers, and APIs, as well as creating services for prompt categorization, enrichment, and metadata. The engineer will apply AI augmentation mindset to their own development and the systems they build, with a focus on production backend systems, distributed systems, and data modeling.
Serve · Engineering · San Francisco, CA · Mar 11 · AI score 5
Customer Support Engineer (Inference), India
Customer Support Engineer role at Together AI, focusing on supporting customers with their training, fine-tuning, and inference solutions. The role involves deep technical problem-solving on GPU clusters and AI services, acting as a product expert and a liaison between customers and internal engineering/product teams. Requires strong technical background in AI, ML, and HPC, with experience in customer-facing technical support.
Serve · Post-train · Engineering · Remote · Mar 10 · AI score 5
Engineering Manager / Tech Lead
Engineering Manager / Tech Lead for the Sandbox team, responsible for building and operating isolated, secure compute environments for AI code execution, including reinforcement learning workflows, LLM code interpreters, and AI agents. This role involves technical leadership, people management, hiring, and collaborating with product and other engineering teams. The team builds sandbox infrastructure, SDKs, platform integrations, and developer tooling.
— · Engineering · Amsterdam, Netherlands · Feb 27 · AI score 5
Customer Support Engineer (GPU Cluster), India
Customer Support Engineer for GPU Clusters at Together AI, focusing on resolving technical challenges for customers building training, fine-tuning, and inference solutions. The role involves being a product expert, collaborating with engineering and product teams, and transforming customer insights into product improvements. Requires experience in customer-facing technical roles, AI/ML/GPU technologies, and infrastructure services like Kubernetes.
— · Engineering · Remote · Aug '25 · AI score 5
Senior Software Engineer - Together Cloud Platform
Senior Backend Engineer role focused on building and scaling the AI Acceleration Cloud platform, which virtualizes ML hardware and provides self-serve AI cloud services for ML practitioners. Responsibilities include developing distributed GPU scheduling systems, global management planes, and customer-facing cloud platform services, ensuring high availability and performance.
— · Engineering · San Francisco, CA · Jun '25 · AI score 5
AI Infrastructure Engineer (SRE), Amsterdam
AI Infrastructure Engineer (SRE) responsible for keeping user-facing services and production systems running smoothly, specializing in systems, availability, reliability, and scalability. The role involves building and running infrastructure with Ansible, Terraform, and Kubernetes, implementing monitoring and observability, and debugging production issues.
— · Engineering · Europe · Apr '25 · AI score 5
IT Engineer
IT Engineer role focused on providing hands-on support for employees, managing IT infrastructure (identity, devices, SaaS), and contributing to IT initiatives and documentation. Requires experience with Okta, Google Workspace, and MDM for macOS/Linux.
— · Engineering · Amsterdam, Netherlands · 2w ago · AI score 0
Finance Analytics Engineer
This role is for a Finance Analytics Engineer who will own the data layer for the Finance team, building models, pipelines, and reporting infrastructure. Responsibilities include owning the dbt transformation layer, orchestrating runs with Airflow, delivering dashboards, partnering with finance teams, setting data quality standards, and building a data foundation to support AI automation. Requires 5+ years of experience in analytics engineering or data engineering, with expertise in SQL, dbt, Snowflake, and Airflow, and strong dimensional modeling fundamentals.
— · Engineering · San Francisco, CA · 4w ago · AI score 0
Staff Backend Engineer - Commerce
Staff Backend Engineer to own the technical vision, architecture, and execution of the commerce platform powering Together's Cloud products, including usage-based billing, payment processing, customer-facing analytics, and product entitlements. This role requires deep expertise in backend systems, distributed systems, and API design, with a focus on scalability, fault tolerance, and influencing cross-functional teams.
— · Engineering · San Francisco, CA · 5w ago · AI score 0
Director, Data Center Operations
This role is for a Director of Data Center Operations at Together AI, focusing on building and scaling the physical infrastructure for AI workloads. The responsibilities include designing and commissioning data center white space, managing power and cooling systems, and building a break-fix team. It is a ground-floor, builder role with ownership over operational foundations.
— · Engineering · San Francisco, CA · 5w ago · AI score 0
Analytics Engineer — Data Warehouse
Staff Analytics Engineer role focused on building and maintaining the data warehouse transformation layer using dbt and Airflow. The role involves dimensional modeling, data quality, governance, and stakeholder management, with a focus on financial and billing data. The company is an AI infrastructure and platform company.
— · Engineering · San Francisco, CA · 5w ago · AI score 0
Lead/Manager Site Reliability Engineering Team (Amsterdam)
Lead a team of Site Reliability Engineers (SRE) responsible for keeping user-facing services and production systems running smoothly. The role involves managing, developing, and coaching the SRE team, building and running infrastructure using Ansible, Terraform, and Kubernetes, implementing monitoring systems, designing operational processes, debugging production issues, and planning infrastructure growth. The company is an AI research company, but this role is focused on the underlying infrastructure and operations, not direct AI/ML model development or research.
— · Engineering · Europe · 6w ago · AI score 0
Staff Engineer, Product UI Platform
Staff Engineer to own and evolve the Product UI Platform, the architectural foundation for full-stack features across the web surface. This role will drive the technical direction of the Next.js/TypeScript/Node.js web runtime, BFF layer, and application integration patterns, evolving the product runtime from a monolithic growth architecture to a scalable, modular, and high-leverage platform.
— · Engineering · San Francisco, CA · Mar 12 · AI score 0
Data Warehouse Engineer
Staff Data Warehouse Engineer responsible for designing, operating, and evolving a data warehouse stack (bronze/silver/gold), owning core data models and metrics, and establishing data quality and governance standards. The role involves building and maintaining data pipelines, designing analytics-ready models, leading Master Data Management patterns, implementing data quality checks, and building a business semantic layer. The engineer will use SQL, Python, and Spark, mentor junior engineers, and contribute to technical standards.
— · Engineering · San Francisco, CA · Mar 12 · AI score 0
Senior Program Manager, Data Center Build
This role manages the physical build-out and fit-out of data centers specifically designed to support high-density AI compute environments, including GPU clusters and potentially liquid cooling systems. The responsibilities involve overseeing construction projects from contract negotiation through commissioning, coordinating with various vendors and stakeholders, managing budgets and schedules, and ensuring compliance with safety and security standards. While the role is critical for enabling AI workloads, it focuses on the physical infrastructure rather than the AI models or software themselves.
— · Engineering · Remote · Mar 10 · AI score 0
Staff Engineer, API Core Platform
Staff Engineer to found the API Platform team, focusing on building and scaling core systems and architecture for Together AI's mission-critical APIs. Responsibilities include improving the backend API layer, designing next-gen API platform solutions, and ensuring reliability, performance, and consistency across public and client APIs. The role requires deep hands-on experience with critical-path code and building platforms that unify engineering efforts.
— · Engineering · San Francisco, CA · Feb 24 · AI score 0
Senior Network Engineer (Amsterdam)
Senior Network Engineer responsible for designing, implementing, and maintaining network infrastructure for high-performance compute networks, with a focus on AI training workloads. Requires expertise in routing, switching, network security, automation, and specific hardware/protocols like RoCE or Infiniband.
— · Engineering · Amsterdam, Netherlands · Jan 20 · AI score 0
Senior Technical Recruiter
This role is for a Senior Technical Recruiter at Together AI, a company building an AI Acceleration Cloud. The recruiter will partner with engineering leaders to drive hiring for core engineering functions, manage the candidate journey, provide market intelligence, and design interview processes. The company focuses on the generative AI lifecycle, AI cloud infrastructure, and open-source AI research.
— · Engineering · San Francisco, CA · Oct '25 · AI score 0
Senior Developer Productivity Engineer
Senior Developer Productivity Engineer at Together AI, a research-driven AI company. Focuses on optimizing engineering workflows, CI/CD pipelines, and building shared tooling to accelerate software delivery. Requires strong experience in DevOps, CI/CD, and Python/Go/TypeScript.
— · Engineering · San Francisco, CA · Jun '25 · AI score 0
Senior Data Engineer
Senior Data Engineer to build and operate data infrastructure for billing, analytics, and BI tools. Requires expertise in stream processing, real-time analytics, and IaC. Role involves designing, building, and scaling data platforms in a fast-paced environment.
— · Engineering · San Francisco, CA · May '25 · AI score 0
Senior Network Engineer
Senior Network Engineer responsible for designing, implementing, and maintaining network infrastructure for AI company's user-facing services and production systems. Focus on routing, switching, network security, and protocols, with an emphasis on automation and HPC-based data center networking. Experience with large-scale hybrid data center networks, TCP/IP, BGP, OSPF, VXLAN, EVPN, QoS, and network automation tools (Python, Ansible). Proficient in network troubleshooting tools and Linux environments. Experience with cloud networks (AWS, GCP, Azure) and multi-vendor network devices (Cisco, Arista, Juniper, Mellanox). Preferred knowledge of RoCE, Infiniband, Docker, Kubernetes, Slurm, and AI training workloads.
— · Engineering · San Francisco, CA · Jan '25 · AI score 0