Solution Specialist, AI Runtime Services

Weights & Biases · Data AI · Bellevue, WA +4 · Global Field Organization

This role focuses on bringing new AI runtime services, such as model serving and sandboxes, to market. It involves driving initial customer adoption, gathering feedback for the product roadmap, and enabling sales teams to position these services. The role requires deep expertise in AI runtime infrastructure, including serving frameworks, inference optimization, and execution isolation.

What you'd actually do

Own the commercial and technical strategy for net new customer wins in AI runtime infrastructure, where execution performance, deployment flexibility, and operational reliability are the primary buying triggers.
Drive new business opportunities where inference latency, throughput bottlenecks, workload isolation requirements, or operational complexity are barriers to scaling AI on CoreWeave.
Develop deep expertise across the AI runtime landscape (model serving architectures, execution scheduling, containerized AI workloads, and secure multi-tenant compute), using CoreWeave's Inference and Sandboxes products as flagship examples of what best-in-class runtime looks like.
Translate customer requirements around serving frameworks (e.g., vLLM, TensorRT-LLM, TGI), batching strategies, and execution isolation into specific product feedback that shapes the AI Runtime Services roadmap.
Develop deal structures, technical playbooks, and benchmark narratives that help sales and SA teams accelerate runtime-sensitive opportunities across the full spectrum of AI deployment patterns.

Skills

Required

10+ years of experience in distributed systems, ML infrastructure, or production AI engineering
5+ years working with AI runtime systems (model serving, inference optimization, containerized workload execution, or real-time ML pipelines) in a customer-facing or deal-shaping capacity
Deep working knowledge of how AI workloads execute at runtime: serving frameworks, batching strategies, GPU memory management, and the performance levers that determine throughput and latency at scale
Experience with sandboxed and isolated execution environments (microVM architectures, container runtimes, secure multi-tenant scheduling)
Familiarity with Kubernetes-native runtime orchestration (autoscaling, scheduling policies, GPU operators)
Ability to benchmark, explain, and commercially position runtime performance differences

Nice to have

Experience driving new business or shaping product strategy in industries with high-throughput AI runtime demands, such as generative AI applications, autonomous systems, financial modeling, or developer platforms.
Prior background in technical sales, solution consulting, or product management supporting large-scale inference infrastructure or AI platform decisions.
Deep understanding of cost-per-token economics, inference fleet optimization, and the commercial tradeoffs between on-demand, reserved, and spot GPU capacity for runtime workloads.
Advanced degree in Computer Science, Machine Learning, or Engineering, or equivalent experience with a demonstrated ability to operate at the intersection of technical architecture and commercial strategy.

What the JD emphasized

AI runtime infrastructure
execution performance
deployment flexibility
operational reliability
inference latency
throughput bottlenecks
workload isolation requirements
operational complexity
model serving architectures
execution scheduling
containerized AI workloads
secure multi-tenant compute
serving frameworks
batching strategies
execution isolation
runtime performance tradeoffs
cost-per-token economics
architectural decisions
throughput modeling
GPU utilization commitments
SLA structures
serving efficiency
distributed systems
ML infrastructure
production AI engineering
customer outcomes
revenue
AI runtime systems
model serving
inference optimization
containerized workload execution
real-time ML pipelines
GPU memory management
throughput
latency at scale
sandboxed and isolated execution environments
microVM architectures
container runtimes
secure multi-tenant scheduling
execution isolation requirements
GPU memory hierarchies
model parallelism strategies
runtime architecture decisions
cost
latency
scalability outcomes
Kubernetes-native runtime orchestration
autoscaling
scheduling policies
GPU operators
workload portability
platform stickiness
benchmark
commercially position runtime performance differences
deployment patterns
instance types
serving configurations
high-throughput AI runtime demands
generative AI applications
autonomous systems
financial modeling
developer platforms
technical sales
solution consulting
product management
large-scale inference infrastructure
AI platform decisions
inference fleet optimization
GPU capacity
runtime workloads

Other signals

AI runtime services
model serving
inference platform
customer adoption
product roadmap

CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at www.coreweave.com.

What You'll Do:

As CoreWeave turns raw GPU capacity into production-grade AI services, it is launching new ways for customers to run, scale, and serve AI workloads, and those new services need a market-maker. As a Solution Specialist for AI Runtime Services, you open new market opportunities across the execution layer (high-throughput, low-latency model serving through our Inference platform, and secure, isolated execution through Sandboxes) and drive the initial adoption of these offerings with the earliest customers and industries to need them. You channel what you learn into the product roadmap and make the broader sales and solution architecture organization fluent in the value runtime services create for new customers.

About the role:

As a Solution Specialist for AI Runtime Services, you work at the leading edge of how CoreWeave brings new runtime offerings to market. Rather than running an existing motion, you create one: you take newly launched services like Inference and Sandboxes into new accounts and new industries, prove their value with the first wave of customers operationalizing AI at scale, and establish the playbooks sales and solution architects use to repeat those wins. You are the field's authoritative voice back to engineering, translating what early adopters need around serving frameworks, batching, and execution isolation into the priorities that shape the AI Runtime Services roadmap.

In this role, you will:

Own the commercial and technical strategy for net new customer wins in AI runtime infrastructure, where execution performance, deployment flexibility, and operational reliability are the primary buying triggers.
Drive new business opportunities where inference latency, throughput bottlenecks, workload isolation requirements, or operational complexity are barriers to scaling AI on CoreWeave.
Develop deep expertise across the AI runtime landscape (model serving architectures, execution scheduling, containerized AI workloads, and secure multi-tenant compute), using CoreWeave's Inference and Sandboxes products as flagship examples of what best-in-class runtime looks like.
Translate customer requirements around serving frameworks (e.g., vLLM, TensorRT-LLM, TGI), batching strategies, and execution isolation into specific product feedback that shapes the AI Runtime Services roadmap.
Develop deal structures, technical playbooks, and benchmark narratives that help sales and SA teams accelerate runtime-sensitive opportunities across the full spectrum of AI deployment patterns.
Engage directly with enterprise and research buyers as the authoritative voice on runtime performance tradeoffs, cost-per-token economics, and the architectural decisions that separate prototype deployments from production-scale AI systems.
Design the commercial framework for large-scale runtime deployment deals, including throughput modeling, GPU utilization commitments, and SLA structures that support enterprise closings.
Partner with product and infrastructure teams to maintain a competitive edge on serving efficiency, execution isolation, and operational reliability across active and prospective customer deployments.

Who You Are:

10+ years of experience in distributed systems, ML infrastructure, or production AI engineering, with a track record of applying that expertise to drive customer outcomes and revenue.
5+ years working with AI runtime systems (model serving, inference optimization, containerized workload execution, or real-time ML pipelines) in a customer-facing or deal-shaping capacity.
Deep working knowledge of how AI workloads execute at runtime: serving frameworks, batching strategies, GPU memory management, and the performance levers that determine throughput and latency at scale (with specific familiarity with products like vLLM, TensorRT-LLM, or Triton).
Experience with sandboxed and isolated execution environments (microVM architectures, container runtimes, secure multi-tenant scheduling) and how execution isolation requirements shape platform selection decisions.
Strong understanding of GPU memory hierarchies, model parallelism strategies, and how runtime architecture decisions translate into cost, latency, and scalability outcomes for enterprise customers.
Familiarity with Kubernetes-native runtime orchestration (autoscaling, scheduling policies, GPU operators) and how it impacts workload portability, operational complexity, and platform stickiness.
Ability to benchmark, explain, and commercially position runtime performance differences across deployment patterns, instance types, and serving configurations.

Preferred:

Experience driving new business or shaping product strategy in industries with high-throughput AI runtime demands, such as generative AI applications, autonomous systems, financial modeling, or developer platforms.
Prior background in technical sales, solution consulting, or product management supporting large-scale inference infrastructure or AI platform decisions.
Deep understanding of cost-per-token economics, inference fleet optimization, and the commercial tradeoffs between on-demand, reserved, and spot GPU capacity for runtime workloads.
Advanced degree in Computer Science, Machine Learning, or Engineering, or equivalent experience with a demonstrated ability to operate at the intersection of technical architecture and commercial strategy.

Wondering if you're a good fit?

We believe in investing in our people, and we value candidates who bring their own diverse experiences to our teams, even if they aren't a 100% skill or experience match. Here are a few qualities we've found compatible with our team. If some of this describes you, we'd love to talk.

You love translating boardroom goals into technical reality: you can explain high-level business strategy and low-level runtime performance tradeoffs in the same meeting.
You love acting as the bridge between high-level sales strategy and deep ML infrastructure engineering.
You are an expert at navigating "The Room": you have a proven ability to manage complex, multi-stakeholder technical evaluations without losing momentum.

Why CoreWeave?

At CoreWeave, we work hard, have fun, and move fast! We're in an exciting stage of hyper-growth that you will not want to miss out on. We're not afraid of a little chaos, and we're constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:

Be Curious at Your Core
Act Like an Owner
Empower Employees
Deliver Best-in-Class Client Experiences
Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and enables the development of innovative solutions to complex problems. As we get set for takeoff, the organization's growth opportunities are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too. Come join us!

The base salary range for this role is $207,000 to $275,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).

What We Offer

The range we’ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate, which can reflect a variety of factors, including qualifications, experience, interview performance, and location.

In addition to a competitive salary, we offer a variety of benefits to support your needs. The benefits below reflect our US-based offerings; for roles in other locations, benefits vary and are shared during the hiring process. These include:

Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to Participate in Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption

California Applicants

California Consumer Privacy Act

Equal Opportunity & Accommodations

CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.

As part of this commitment and consistent with the Americans with Disabilities Act (ADA), CoreWeave will ensure that qualified applicants and candidates with disabilities are provided reasonable accommodations for the hiring process, unless such accommodation would cause an undue hardship. If reasonable accommodation is needed, please contact: careers@coreweave.com.

Export Control Compliance

This position requires access to export controlled information. To conform to U.S. Government export regulations applicable to that information, applicant must either be (A) a U.S. person, defined as a (i) U.S. citizen or national, (ii) U.S. lawful permanent resident (green card holder), (iii) refugee under 8 U.S.C. § 1157, or (iv) asylee under 8 U.S.C. § 1158, (B) eligible to access the export controlled information without a required export authorization, or (C) eligible and reasonably likely to obtain the required export authorization from the applicable U.S. government agency. CoreWeave may, for legitimate business reasons, decline to pursue any export licensing process.
