Software Engineer - New Products

Baseten · Data AI · San Francisco, CA · EPD

This Software Engineer role focuses on building new products for an AI inference platform. The role involves owning infrastructure capabilities such as API gateways, auth, quotas, metering, and observability, with an emphasis on low-latency, reliable backend services and developer-friendly APIs. Although the role does not involve building ML models directly, it is core to enabling AI companies to ship and operate their AI products at scale.

What you'd actually do

  1. Own and lead projects and product areas end-to-end, including architecture, implementation, rollout, and long-term operations.
  2. Design ergonomic, developer-friendly APIs and abstractions for infrastructure capabilities.
  3. Build and operate reliable backend services (rate limiting, auth, quotas, metering, migrations) with clear SLOs.
  4. Drive performance and reliability improvements through profiling, tracing, load testing, and capacity planning.
  5. Mentor teammates through code reviews, design docs, and technical leadership.

Skills

Required

  • Proven track record of owning low-latency, reliable services (auth, rate limiting, quotas, usage metering, migrations).
  • Strong infrastructure instincts: observability, incident response, SLOs, and capacity management.
  • Comfort working across the stack when needed (backend-first, but willing to dive into frontend/CLI to unblock the product).
  • Strong written communication, including clear design docs and effective cross-functional collaboration.
  • Interest in AI/ML infrastructure and willingness to learn (ML expertise not required).

Nice to have

  • Experience with API gateways, service meshes, Kubernetes, or distributed scheduling.
  • Experience building developer platforms: SDKs, CLIs, APIs, and self-serve workflows.
  • Experience with inference platforms, LLM runtimes, or performance-sensitive systems.
  • Familiarity with multi-tenant isolation patterns (fair queuing, noisy-neighbor controls, admission control).
  • Frontend experience (React/TypeScript) or strong product UX instincts for developer tools.

What the JD emphasized

  • low-latency, reliable services
  • observability
  • incident response
  • SLOs
  • capacity management

Other signals

  • AI infrastructure
  • inference
  • production SLOs
  • developer platforms