Data Engineer

Baseten · Data AI · San Francisco, CA · G&A

Baseten is hiring a Data Engineer to build and scale its internal data platform, transforming raw product and business data into reliable datasets that power decision-making. In this role, you will design data models, pipelines, and analytics infrastructure, working with AI inference, infrastructure, and observability data to generate insights.

What you'd actually do

  1. Design and maintain core data models and semantic layers
  2. Develop and orchestrate batch and streaming data pipelines using technologies such as Apache Beam, Kafka, Airflow, or similar frameworks
  3. Analyze inference and infrastructure telemetry, including data from OpenTelemetry, Grafana, and other observability tools
  4. Define and maintain company-wide metrics across product usage, performance, and customer lifecycle
  5. Enable self-service analytics through agents and tools, backed by well-structured semantic layers and context

Skills

Required

  • Apache Beam
  • Kafka
  • Airflow
  • OpenTelemetry
  • Grafana
  • data reliability
  • data quality
  • data governance

Nice to have

  • inference metrics
  • latency
  • throughput
  • token usage
  • model performance
  • B2B SaaS
  • consumption-based platforms
  • forecasting
  • predictive modeling
  • ARIMA
  • Prophet

What the JD emphasized

  • inference metrics
  • consumption-based

Other signals

  • building and scaling internal data platform
  • transforming raw product and business data into reliable datasets
  • working with AI inference, infrastructure, and observability data