Staff ML Engineer - ML Infrastructure

Samsara · Enterprise · San Francisco, CA · Remote · Safety AI

Staff ML Engineer focused on building and operating an end-to-end ML platform for industrial AI applications, covering training, experimentation, inference (cloud and edge), and deployment. The role emphasizes platform ownership, scalability, reliability, and enabling product teams to ship ML-powered features.

What you'd actually do

  1. Design, build, and operate Samsara’s end-to-end ML platform (training, experimentation, batch/online inference, edge) used by multiple Safety AI product teams.
  2. Evolve shared training and experimentation infrastructure (orchestration, clusters, environments) and standardize tracking, evaluation, and regression testing for fast, safe iteration.
  3. Partner with product and applied ML teams to ship ML-powered features (CV models, EcoDriving insights, LLM-based reporting) that improve safety, reliability, and cost efficiency.
  4. Design and operate scalable online and batch inference systems (Ray, Spark), including deployment patterns, observability, SLOs, and unified training-to-production workflows.
  5. Partner with firmware and edge teams to package, validate, and deploy models to Samsara devices, and build feedback loops from edge to cloud for continuous improvement.

Skills

Required

  • Machine learning engineering
  • Building and operating large-scale ML systems
  • Distributed computing frameworks (Ray, Spark)
  • Cloud infrastructure (AWS)
  • Containers/Kubernetes
  • Production observability tooling
  • ML platforms (training, experimentation, or inference)
  • ML fundamentals (evaluation, experiment design, model iteration)

Nice to have

  • Shipping ML-powered features end-to-end
  • Measurable impact on product or business metrics

What the JD emphasized

  • end-to-end ML platform
  • training, experimentation, batch/online inference, edge
  • ship ML-powered features
  • scalable online and batch inference systems
  • deploy models to Samsara devices
  • 10+ years of overall experience in machine learning engineering or related fields, with a strong track record of building and operating large-scale ML systems.
  • Proven experience building or supporting ML platforms (training, experimentation, or inference) used by multiple teams.
