Software Engineer, Infrastructure

Whatnot · Consumer · Kraków, Poland · Engineering

This is a Software Engineer role in Infrastructure Reliability Engineering at Whatnot. It focuses on building and maintaining the distributed systems, services, and frameworks that keep the platform reliable, resilient, and safe to operate at scale. Responsibilities include designing traffic control mechanisms, load testing frameworks, chaos testing infrastructure, and systems for SLOs and incident management. The role requires strong software engineering fundamentals and experience with distributed systems; the preferred languages are Python, Elixir, or Go.

What you'd actually do

  1. Designing and building distributed systems that support reliability, resiliency, and safe operation at scale
  2. Designing and operating traffic control mechanisms: circuit breakers, rate limiting, admission control, backpressure, and graceful degradation
  3. Building and evolving load testing frameworks that validate system behavior under sustained, burst, and peak event traffic patterns
  4. Building chaos and resilience testing infrastructure to proactively surface failure modes and validate recovery behavior
  5. Building systems that enable teams to define and implement SLOs, SLIs, and error budgets to guide reliability tradeoffs

Skills

Required

  • Designing and building large-scale distributed systems
  • Software engineer first, with strong software engineering fundamentals
  • Designing, building, and operating shared production services and frameworks
  • Traffic control mechanisms such as circuit breakers and rate limiting
  • Building or operating load testing and chaos testing frameworks
  • Hands-on observability, monitoring, and debugging of production systems
  • SLOs, error budgets, and incident response processes
  • Cloud-native environments such as AWS or GCP with Kubernetes and infrastructure as code
  • Clear written and verbal communication skills

Nice to have

  • Experience in Python, Elixir, or Go
  • High-traffic, real-time, or event-driven systems
  • Building developer-facing tools, frameworks, or platform libraries consumed by other engineering teams

What the JD emphasized

  • 5+ years of experience designing and building large-scale distributed systems