| Title | Stage | AI score |
|---|---|---|
| **Post-Training Research Engineer** Baseten is seeking a Post-Training Research Engineer to build in-house tooling for post-training AI models at scale. This role involves deep technical dives into ML techniques, distributed computing, and systems-level concepts to support customers' custom models, which are critical to Baseten's inference platform. | Post-train | 9 |
| **Software Engineer - GPU Kernels** Software Engineer focused on optimizing GPU kernels for ML inference, including matrix multiplications, attention mechanisms, and quantization, using CUDA and PTX assembly. | Serve | 9 |
| **Engineering Manager - Forward Deployed Engineering (LLM)** Engineering Manager for the Forward Deployed Engineering team focused on building, scaling, and optimizing LLM inference workloads for Baseten customers. This role involves hands-on technical ownership, team leadership, and collaboration with product and infrastructure teams to ensure best-in-class performance, reliability, and cost efficiency of AI applications on Baseten's platform. The role contributes to the core codebase and drives the feature roadmap, acting as a player-coach. | Serve, Agent | 8 |
| **Manager, Solutions Architect** Manager for a Solutions Architect team focused on enabling customers to deploy and optimize AI/ML models, particularly LLMs, on Baseten's inference platform. The role involves leadership, technical guidance, customer discovery, and ensuring high performance, reliability, and cost efficiency of AI applications in production. | Serve | 8 |
| **Software Engineer - Voice AI (Inference Runtime)** Software Engineer focused on building and optimizing the inference runtime for Voice AI models, including state-of-the-art open-source models. The role involves developing large-scale, real-time infrastructure for multi-model voice agents, reducing latency, increasing throughput, and improving GPU efficiency. It also includes designing iteration loops for voice model customization. | Serve, Agent | 8 |
| **Software Engineer - Model APIs** Software Engineer role focused on optimizing and operating Model APIs for AI inference, involving distributed systems, model serving, and developer experience. The role emphasizes performance improvements, structured outputs, tool/function calling, and multi-modal serving. | Serve, Agent | 8 |
| **Engineering Manager - Model Performance** Engineering Manager for Model Performance at Baseten, a company providing inference infrastructure for AI companies. The role involves leading a team of engineers to optimize ML model inference and performance, focusing on production-level AI/ML solutions and scaling large models. Requires a strong engineering background, leadership experience, and expertise in ML performance optimization, with hands-on work in areas like TensorRT, PyTorch, and CUDA. | Serve | 8 |
| **Software Engineer - Model Performance** Software Engineer focused on ML performance for LLM inference, optimizing techniques like quantization and speculative decoding, and debugging ML performance issues in libraries like TensorRT and PyTorch. | Serve | 8 |
| **Solution Architect (AI/LLM Inference)** Solution Architect role focused on AI/LLM inference, partnering with Sales and customers to design and deploy technical solutions. Responsibilities include customer discovery, technical scoping, leading demos, managing deployments, and driving POC execution. Requires an AI/ML background and customer-facing communication skills, with the ability to script and prototype. | Serve | 7 |
| **Applied AI Inference Engineer** This role focuses on partnering with customers to architect, build, and deploy high-scale production AI applications on Baseten's platform. It involves owning the customer journey from exploration to deployment, translating business goals into reliable, observable services with clear quality, latency, and cost outcomes. The role blends engineering, product management, technical customer success, and pre-sales solution engineering. | Serve, Agent | 7 |
| **AI Solutions Engineer** AI Solutions Engineer role focused on partnering with customers to architect, build, and deploy high-scale production AI applications on Baseten's platform. This involves owning the customer journey from exploration to production, translating business goals into reliable, observable services with clear quality, latency, and cost outcomes. The role blends engineering, product management, technical customer success, and pre-sales solution engineering. | Ship, Serve | 7 |
| **Solution Architect** Solution Architect role at Baseten, a company providing AI inference infrastructure. The role involves partnering with Sales and customers to understand business needs, design technical solutions, run technical discovery, and guide deployments and proofs of value. Responsibilities include customer discovery calls, technical scoping, leading demos, owning benchmarking and repeatable deployments across various AI modalities, advising on infrastructure tradeoffs, and driving POC execution. Requires an AI/ML background, strong customer-facing communication, and technical depth to scope solutions. | Serve | 7 |
| **Software Engineer - AI Enablement** Baseten is seeking an AI Enablement Engineer to own and develop AI-powered tooling and agent infrastructure for internal productivity. This role involves evaluating, customizing, and deploying AI coding agents and building custom internal agents for tasks like incident triage and codebase Q&A. The engineer will also track usage, measure impact, and stay updated on AI tooling advancements to enhance the engineering organization's effectiveness across the SDLC. | Agent | 7 |
| **Software Engineer - GPU Networking & Distributed Systems** Software Engineer focused on GPU Networking and Distributed Systems to optimize AI inference infrastructure, specifically for LLMs and multi-modal models. The role involves integrating RDMA, optimizing networking layers for disaggregated KV cache and WideEP, enabling fast startup speeds, and building observability tools for bleeding-edge hardware. | Serve | 7 |
| **Software Engineer - Training Product** Software Engineer focused on building and shipping training products for AI companies, working across the full stack from API to infrastructure, including fine-tuning models and partnering with research engineers. The role involves developing features like multi-node training and serverless RL, with a focus on developer experience and reliability. | Post-train, Serve | 7 |
| **Software Engineer, Model Performance Systems** Software Engineer role focused on building and optimizing the performance of AI inference infrastructure, including benchmarking, hardware profiling, and developing automated testing and monitoring tools for LLMs. | Serve | 7 |
| **Software Engineer - Training Infrastructure** Software Engineer on the Training Infrastructure team responsible for architecting and leading development of the ML training platform, focusing on scheduling, storage, networking, reliability, and observability for research engineers and model developers. | Data | 7 |
| **Software Engineer - Infrastructure** Software Engineer focused on building and maintaining the ML inference platform, enabling high-performance deployment, scaling, and monitoring of AI models for production applications. | Serve | 7 |
| **Software Engineer - Core Product** Software Engineer on the Core Product team at Baseten, building and maintaining the core Baseten product that enables users to deploy and get value from ML models. The role involves working across the stack, including CLI tools, REST APIs, and the web application, with a focus on new feature development, API design, and bug fixing. Example initiatives include chains for multi-component workflows, asynchronous inference, model APIs, and model training for production inference. | Serve, Agent | 7 |
| **Forward Deployed Engineer** The Forward Deployed Engineer partners with customers to architect, build, and deploy high-scale production AI applications on Baseten's platform. This role involves owning the customer journey from exploration to production, translating business goals into reliable services with clear quality, latency, and cost outcomes. It blends engineering, product management, technical customer success, and pre-sales solution engineering. | Serve, Agent | 7 |