Applied AI Inference Engineer

Baseten · Data AI · San Francisco, CA · EPD

This role focuses on partnering with customers to architect, build, and deploy high-scale production AI applications on Baseten's platform. It involves owning the customer journey from exploration to deployment, translating business goals into reliable, observable services with clear quality, latency, and cost outcomes. The role blends engineering, product management, technical customer success, and pre-sales solution engineering.

What you'd actually do

  1. Develop and maintain software systems and product features in one or more general-purpose programming languages in a production environment, with a preference for Python given its prevalence in ML work.
  2. Drive customer impact by designing, implementing, and deploying Baseten solutions end-to-end (problem framing → evaluation → production deployment → monitoring). This means working with customers' engineering teams at every stage of the customer journey, including sales, implementation, and expansion.
  3. Deliver with velocity: turn vague objectives into clear specs and well-defined proofs of concept (PoCs) so we can rapidly ship well-tested services and outcomes for our customers.
  4. Optimize and enhance AI/ML projects, contributing to the continuous improvement of our technical stack. This includes developing features and PRDs in collaboration with other engineering and product orgs.
  5. Own products and customer projects end-to-end, functioning as engineer, project manager, and product manager, with a focus on user empathy, project specification, and end-to-end execution.

Skills

Required

  • Python
  • Experience shipping software in a production-level environment
  • Experience with AI/ML pipelines
  • Familiarity with the full lifecycle of ML model development and deployment
  • Strong communication skills, including the ability to explain complex technical topics

Nice to have

  • Experience building or optimizing AI/ML projects

What the JD emphasized

  • high-scale
  • production deployment
  • quality, latency, and cost outcomes
  • customer impact
  • production-level environment
  • AI/ML pipelines
  • lifecycle of ML model development and deployment
  • building or optimizing AI/ML projects

Other signals

  • customer-facing
  • high-scale
  • production deployment
  • latency
  • cost