Senior Machine Learning Engineer - Platform

Samsara · Enterprise · San Francisco, CA · Remote · Platform

This role calls for a Senior Machine Learning Engineer to lead the architectural evolution of safety systems, moving from siloed models to a unified Perception Platform Layer. The work involves building robust infrastructure for real-time ML systems across cloud and edge, with a focus on performance, reliability, and rigorous evaluation for safety-critical applications.

What you'd actually do

  1. Architect a Unified Perception Layer: Lead the transition from fragmented, task-specific models to a modular perception platform that supports reusable components and downstream safety applications.
  2. System Design: Design and implement real-time ML systems—from sensor ingestion and tracking to risk reasoning and actuation—ensuring clear interfaces and predictable system behavior.
  3. Hybrid Deployment: Orchestrate model integration across edge and cloud environments, managing versioning, rollouts, and mission-critical fallback mechanisms.
  4. Latency Ownership: Own end-to-end latency and reliability for safety-critical pipelines. You will profile, schedule, and optimize messaging and backpressure across the entire stack.
  5. Observability & Feedback Loops: Build sophisticated monitoring for deployed models to detect drift, false positives/negatives, and latency regressions. You will "close the loop" to ensure production data informs the next iteration of training.

Skills

Required

  • 6+ years of experience in ML Engineering
  • shipping models in production
  • safety-critical domains
  • distributed systems
  • performance profiling
  • computer vision
  • cloud ML workflows (AWS/GCP/Azure)
  • containerization
  • edge hardware constraints
  • system design

Nice to have

  • Ph.D. in Computer Science or quantitative discipline
  • containerization technologies (e.g., Docker, Kubernetes)
  • CI/CD pipelines
  • infrastructure-as-code (IaC) frameworks
  • deploying and managing ML applications in cloud environments
  • leveraging cloud-based services for data storage, processing, and inference
  • building end-to-end ML applications

What the JD emphasized

  • Proven track record of shipping models in production (ideally in safety-critical domains like robotics, automotive, or industrial AI)
  • Deep understanding of distributed systems, performance profiling, and computer vision
  • Architectural Mindset: You don't just write code; you design systems.

Other signals

  • architect a unified perception layer
  • real-time ML systems
  • hybrid deployment across edge and cloud
  • latency ownership for safety-critical pipelines
  • observability for deployed models
  • evaluation frameworks for rare safety events
  • explainability for production code