Solutions Architect at Fireworks AI

About Us:

At Fireworks, we’re building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We’ve been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge innovation through projects like our own function calling and multimodal models. Fireworks is a Series C company valued at $4 billion and backed by top investors including Benchmark, Sequoia, Lightspeed, Index, and Evantic. We’re an ambitious, collaborative team of builders, founded by veterans of Meta PyTorch and Google Vertex AI.

In the last few months alone, we've launched the Fireworks Training Platform, partnered with Microsoft Azure Foundry, and published research straight from our production systems, all of which is helping scale some of the most innovative companies and products of our generation.

As an SA you'll be close to all of it. The customer conversations you lead directly feed our roadmap, and the work you do shows up in what we build and publish next. A few examples of what that looks like in practice:

Frontier RL is cheaper than the mega-cluster narrative suggests — we ran cross-region rollouts using 98% sparse weight deltas and published what we learned
Training-inference parity in MoE models — kernel fusions that are mathematically equivalent can still drift numerically; we shipped the fixes across Kimi K2.5 and Qwen3.5-MoE
The fine-tuning bottleneck isn't the algorithm — integration friction and iteration speed are what actually stall teams; we documented the patterns across dozens of customer engagements

If you want to work on hard infrastructure problems, be close to the customers pushing the frontier, and actually see your work ship, come work with us!

The Role:

Solutions Architects at Fireworks are the technical and strategic owners of the customer relationship from the first discovery call through to production. You'll work with some of the most ambitious engineering teams in the world, translating complex business problems into concrete AI solutions built on the Fireworks platform.

This is a role that demands both technical depth and strong people skills. You'll need to earn the trust of ML engineers and VPs in the same meeting, scope and execute POCs without losing sight of the customer's definition of success, and know enough about inference, fine-tuning, and model architecture to make credible recommendations under pressure.

We hire SAs across two tracks. Both require strong technical grounding and sharp customer instincts; the difference is where each track places its emphasis.

Enterprise SA Track

Works with digital-native and large organizations, navigating multiple stakeholders, procurement cycles, and executive relationships
Heavy emphasis on executive presence: equally comfortable presenting to a CTO and debugging a latency issue with an ML engineer
Leads complex technical sales: discovery, solution design, POC execution, commercial negotiation
Owns the account relationship end-to-end, including expansion and renewal
Strong commercial instincts: understands how to build a business case and close large deals

Applied AI Track

Works with high-velocity accounts and technology partners: startups, ISVs, and hyperscaler ecosystems
Heavier emphasis on technical execution: more time in the code, building integrations, running enablements
Faster iteration cycles with less org navigation; focused on shipping working solutions quickly
Embeds with partner engineering teams to enable their AI practices and build joint solutions
Comfortable operating across engineering, partnerships, and sales simultaneously

What You'll Work On:

Regardless of track, SAs at Fireworks own a consistent set of responsibilities:

Technical Discovery & Solution Design

Lead structured discovery conversations to unpack customer pain points, constraints, and success criteria before proposing solutions
Design end-to-end architectures for GenAI applications covering model selection, inference configuration, RAG design, and fine-tuning strategy

POC Scoping & Execution

Define what a minimal, compelling proof-of-concept looks like and own it through to delivery. Prioritize and stack-rank opportunities, manage scope creep, set realistic timelines, and keep the customer aligned on what success looks like
Work alongside product and engineering teams to execute technically rigorous POCs

Performance Engineering

Run inference sweeps and establish performance baselines for customer workloads
Create and configure deployments tuned to specific latency, throughput, and cost targets

Fine-Tuning & Model Recommendations

Guide customers on fine-tuning strategy and model recommendations: when to use SFT, DPO, or RFT, and which model family fits their use case
Build and run fine-tuning pipelines directly for customers
Evaluate model quality and help customers build robust eval pipelines

Account Ownership & Stakeholder Management

Own the technical relationship across the account: from champion to executive sponsor
Navigate complex organizations, build trust at multiple levels, and maintain momentum through long sales cycles
Feed customer signal (deployment patterns, pain points, feature gaps) back into the product roadmap

What We're Looking For

5+ years in a technical, customer-facing role: Solutions Architect, Sales Engineer, Forward Deployed Engineer, customer-facing AI Engineer / Data Scientist, or equivalent
Hands-on experience with the LLM stack: inference trade-offs, fine-tuning methodologies (SFT, RFT, DPO), and deploying models at scale
Strong Python skills: comfortable reading, writing, and debugging production code
Exceptional communication: able to run a sharp discovery call, present to a VP, and explain reinforcement learning to an ML engineer in the same afternoon
Experience with cloud infrastructure (AWS, Azure, GCP) and model serving at scale

Why Fireworks AI?

Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.

Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.

