Member of Technical Staff, Evals & Post-Training Product

Fireworks AI · Data AI · San Mateo, CA · Engineering

Fireworks AI is seeking a Member of Technical Staff, Evals & Post-Training Product, to build products and workflows that integrate evaluation and post-training. The role involves building internal eval tooling, owning fine-tuning product experiences, and working with users to identify needs and productize solutions. It requires software engineering experience, hands-on experience with LLM evaluations and/or post-training methods, product engineering skills, and an understanding of the GenAI lifecycle.

What you'd actually do

  1. Build internal eval workflows: Design and scale evaluation tooling used by internal teams to measure model quality, compare model changes, and inform post-training decisions.
  2. Own fine-tuning product experiences: Build and improve user-facing product workflows for post-training, including fine-tuning experiences across SFT, RFT, and related model-improvement capabilities.
  3. Work closely with users: Partner with customers and internal stakeholders to understand evaluation and fine-tuning needs, support high-priority engagements, triage issues, and convert bespoke workflows into productized solutions.

Skills

Required

  • 1 to 7 years of software engineering experience
  • Hands-on experience with LLM evaluations and/or post-training methods
  • Product Engineering Skills
  • Understanding of the GenAI Lifecycle
  • User-Centric Mindset

Nice to have

  • 3+ years of software engineering experience
  • Domain-Specific Evaluation Experience
  • Open Source Contributions
  • Inference & Hardware Knowledge
  • Startup DNA

What the JD emphasized

  • Hands-on experience with LLM evaluations and/or post-training methods
  • Understanding of the GenAI Lifecycle

Other signals

  • building products and workflows that connect evaluation and post-training into a continuous loop
  • enabling external developers through our open-source Eval Protocol SDK
  • owning key product experiences for fine-tuning custom models on Fireworks
  • work across the stack, from APIs, SDKs, and backend systems to user-facing product surfaces in the web app, to make it easier for users to author evals, understand results, fine-tune models, and iterate quickly
  • work directly with customers and internal teams to identify friction, support real-world use cases, and turn repeated pain points into reusable product capabilities