Senior Software Engineer, Compute Platform

Reddit · Consumer · United States · Remote · BE Platform

Senior Software Engineer on Reddit's Compute Platform team, working on infrastructure and software for high-level orchestration of multi-cloud deployments and for intra-cluster challenges. The role involves building APIs, controllers, and SDKs that automate the global fleet lifecycle and optimize performance, resource packing, and hardware management, including GPUs. Responsibilities include designing and delivering Go solutions that improve availability, scalability, and latency; developing Kubernetes controllers and operators; building core tooling and SDKs; optimizing intra-cluster performance; and collaborating across the organization. Requires 4+ years of experience with internet-scale software, Go proficiency, Kubernetes expertise, knowledge of Linux internals, and open-source contributions.

What you'd actually do

  1. Design and deliver software solutions in Go to improve the availability, scalability, and latency of Reddit’s compute infrastructure.
  2. Develop Kubernetes controllers and operators to automate cluster management, workload scheduling, and the reconciliation of complex system states.
  3. Build core tooling and SDKs that codify network configurations, managed services, and compute capacity tracking across a multi-region fleet.
  4. Optimize intra-cluster performance by developing reactive schedulers and detecting node-level characteristics to inform availability.
  5. Collaborate across the organization to provide technical feedback and automate critical development workflows and infrastructure operations.

Skills

Required

  • 4+ years of experience developing internet-scale software with a heavy focus on infrastructure and distributed systems.
  • Proficient in Go with a proven track record of building and managing Kubernetes services at scale.
  • An expert in Linux internals, including a solid understanding of multi-tenancy primitives like cgroups and namespaces.
  • A contributor to the open-source community, ideally within the infrastructure or CNCF domain.
  • A self-starter capable of troubleshooting complex, cross-system issues and managing large projects independently.
  • An excellent communicator who thrives in a collaborative, service-oriented environment.

What the JD emphasized

  • internet-scale software
  • Kubernetes services at scale
  • Linux internals
  • multi-tenancy primitives like cgroups and namespaces
  • open-source community, ideally within the infrastructure or CNCF domain