About Us:

At Fireworks, we’re building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We’ve been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge innovation through projects like our own function calling and multimodal models. Fireworks is a Series C company valued at $4 billion and backed by top investors including Benchmark, Sequoia, Lightspeed, Index, and Evantic. We’re an ambitious, collaborative team of builders, founded by veterans of Meta PyTorch and Google Vertex AI.

We are seeking a Member of Technical Staff, Evals & Post-Training Product to help define how developers improve models on Fireworks. This role sits at the intersection of product engineering, developer experience, and model quality.

You will build the products and workflows that connect evaluation and post-training into a continuous loop: helping internal teams run evals at scale, enabling external developers through our open-source Eval Protocol SDK, and owning key product experiences for fine-tuning custom models on Fireworks.

You will work across the stack—from APIs, SDKs, and backend systems to user-facing product surfaces in the web app—to make it easier for users to author evals, understand results, fine-tune models, and iterate quickly. You will also work directly with customers and internal teams to identify friction, support real-world use cases, and turn repeated pain points into reusable product capabilities.

Key Responsibilities:

Build internal eval workflows: Design and scale evaluation tooling used by internal teams to measure model quality, compare model changes, and inform post-training decisions.
Own fine-tuning product experiences: Build and improve user-facing product workflows for post-training, including fine-tuning experiences across SFT, RFT, and related model-improvement capabilities.
Work closely with users: Partner with customers and internal stakeholders to understand evaluation and fine-tuning needs, support high-priority engagements, triage issues, and convert bespoke workflows into productized solutions.

Minimum Requirements:

1-7 years of software engineering experience (we are hiring at multiple levels for this role).
Hands-on experience with LLM evaluations and/or post-training methods: You know how to design useful evals and use their results to guide model improvement.
Product Engineering Skills: The ability to work across backend systems and developer-facing product surfaces. Comfortable shipping full-stack features when needed.
Understanding of the GenAI Lifecycle: You understand the end-to-end workflow—from prompting a base model to curating a dataset, fine-tuning, and productionizing agents—and how these steps interconnect.
User-Centric Mindset: Willing to talk to users, triage GitHub issues for open-source projects, and build products from scratch to serve emerging needs.

Preferred Qualifications:

3+ years of software engineering experience.
Domain-Specific Evaluation Experience: Strong familiarity with designing and running evaluations for domain-specific use cases (e.g. medical, legal, coding, or custom internal datasets).
Open Source Contributions: Prior contributions to developer tools or AI/ML repositories.
Inference & Hardware Knowledge: Interest in the hardware side of AI—understanding GPU constraints, inference optimization techniques, and how they relate to model performance.
Startup DNA: Experience in fast-paced environments where you own features end-to-end.

Why Fireworks AI?

Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.

Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.

