Staff Backend Engineer, ML Inference Systems

Unity · Enterprise · Mountain View, CA · AI & Machine Learning

A Staff Backend Engineer role building and operating the infrastructure for ML models that govern ad ranking and bidding decisions across billions of daily impressions. The work centers on designing and running distributed systems for large-scale online model inference, with an emphasis on performance, reliability, and scalability.

What you'd actually do

  1. Design, develop, and deploy production-grade backend services and distributed systems that power large-scale online model inference, handling billions of daily requests
  2. Drive technical direction of our inference platform, with a focus on low-latency, high-throughput serving infrastructure
  3. Partner with ML engineers to ensure online serving infrastructure scales with growing model complexity and inference volumes, without compromising latency or throughput
  4. Ensure the reliability, scalability, and efficiency of our systems in production using monitoring and observability tools like Prometheus and Grafana
  5. Manage and optimize cloud infrastructure on GCP, orchestrating workloads with Kubernetes across a high-scale production environment

Skills

Required

  • 5+ years designing, deploying, and maintaining distributed systems at scale
  • Expertise in Golang for building high-performance, low-latency backend infrastructure
  • Hands-on experience with cloud infrastructure on GCP and workload orchestration with Kubernetes
  • Strong grounding in monitoring and observability tooling, including Prometheus and Grafana
  • Experience in ad tech, recommender systems, real-time personalization, or other performance-critical domains
  • Familiarity with microservice architectures, containerization (Docker), and CI/CD best practices
  • Familiarity with machine learning platforms, workflows, and serving infrastructure

Nice to have

  • Experience with ML inference servers like NVIDIA Triton Inference Server
  • Familiarity with auction mechanics or bidding systems in an ad tech context
  • Experience embracing AI as a strategic advantage in engineering, following established best practices for code quality and security

What the JD emphasized

  • billions of daily impressions
  • large-scale machine learning
  • real-world impact converge at scale
  • billions of daily requests
  • low-latency, high-throughput serving infrastructure
