Staff Software Engineer, AI Platform

Harvey · AI Frontier · San Francisco, CA · Engineering

The Staff Software Engineer, AI Platform role at Harvey focuses on building the foundational AI platform for agentic AI systems in the legal domain. Responsibilities include context engineering, agent infrastructure, model integration and routing, evaluation infrastructure, and shared abstractions that enable product teams. The role requires experience in backend systems, AI/ML engineering, and shipping multi-model AI systems.

What you'd actually do

  1. Design and build abstractions and platform-level systems that improve all of Harvey’s agentic products.
  2. Own infrastructure for model integration, routing, and evaluation that helps Harvey choose and deploy the right foundation model for any given context.
  3. Build evaluation frameworks and tooling that let every team across Harvey iterate on AI quality effectively.
  4. Partner closely with product engineering teams, PMs, and design to launch cutting-edge AI products.
  5. Evaluate, prototype, and integrate the latest advancements in AI and agentic systems as they emerge.

Skills

Required

  • 8+ years of experience building backend systems
  • 1+ year focused on AI/ML engineering
  • A track record of technical leadership across teams
  • Experience building and shipping multi-model or multi-provider AI systems in production
  • Familiarity with context management, session state, or memory systems in AI or distributed systems
  • A track record of building internal platforms, SDKs, or shared infrastructure
  • Strong judgment about abstractions
  • Opinionated about good design but pragmatic about shipping incrementally
  • Excitement about agentic AI and the infrastructure challenges of making autonomous systems reliable when the stakes are real
  • A bias toward full ownership
  • Ability to navigate ambiguity well

Nice to have

  • Experience building evaluation frameworks
  • Experience working with agent/function-calling architectures
  • Familiarity with legal or other high-stakes professional services domains
  • Time at early-stage or hyper-growth startups where the underlying technology changes regularly

What the JD emphasized

  • AI Platform
  • agent infrastructure
  • model routing
  • evals
  • multi-model or multi-provider AI systems in production
  • evaluation frameworks
  • agent/function-calling architectures

Other signals

  • AI Platform
  • Agent Infrastructure
  • Model Integration & Routing
  • Evaluation Infrastructure
  • Shared Abstractions