Principal Software Engineer, Ads Infrastructure

Unity · Enterprise · San Francisco, CA · Engineering

Principal Software Engineer role focused on building and operating core distributed systems for Unity's Ads Infrastructure, a large real-time advertising platform. The role involves defining architecture, owning critical domains, driving operational excellence, and partnering on long-term strategy for scalable, cost-efficient, and highly available systems.

What you'd actually do

  1. Define and guide the architecture and design of large-scale distributed systems, ensuring scalability, fault tolerance, cost efficiency, and long-term maintainability across multi-region deployments
  2. Take end-to-end ownership of critical infrastructure domains, operating services in production, participating in on-call rotations, and leading deep root-cause analysis of complex cross-system incidents
  3. Drive operational excellence by establishing SLO frameworks, advancing observability and resiliency patterns, and automating reliability and capacity management across the platform
  4. Elevate engineering standards through rigorous design reviews, architectural alignment, performance benchmarking, and mentorship of senior and staff engineers
  5. Partner with leadership on long-term infrastructure strategy, including capacity planning, cost optimization, and technical input to cloud vendor negotiations

Skills

Required

  • designing, building, and operating large-scale distributed systems
  • high-availability production environments
  • production-grade backend services and infrastructure systems
  • Kubernetes
  • cloud-native architectures (GCP, AWS, or Azure)
  • multi-cluster and multi-region patterns
  • traffic management
  • load balancing
  • networking
  • messaging systems (e.g., Kafka)
  • large-scale stream or batch data processing

Nice to have

  • distributed caching
  • storage engines
  • stateful systems supporting low-latency global workloads
  • service mesh
  • routing
  • service discovery architectures
  • online advertising and adtech systems
  • real-time bidding scale
  • distributed systems concepts (replication, partitioning, consistency models, failure handling, performance optimization)

What the JD emphasized

  • 10 or more years of experience designing, building, and operating large-scale distributed systems in high-availability production environments
  • Strong programming expertise with a track record of delivering production-grade backend services and infrastructure systems
  • Deep hands-on experience with Kubernetes and cloud-native architectures in GCP, AWS, or Azure, including multi-cluster and multi-region patterns
  • Advanced expertise in distributed infrastructure, including traffic management, load balancing, networking, messaging systems such as Kafka, and large-scale stream or batch data processing