Cloud Software Engineer - Observability Platform

ClickHouse · Data AI · United States · Engineering

The Cloud Software Engineer will design, build, and operate distributed systems for ClickHouse's observability platform, which handles trillions of events daily. The role focuses on the reliability, performance, and cost-efficiency of telemetry pipelines and storage systems, and includes taking part in on-call rotations and building automation. It requires strong production debugging skills and experience with Golang, Kubernetes, cloud providers, and observability tools such as OpenTelemetry and Prometheus.

What you'd actually do

  1. Design, build, and operate distributed systems that power observability across ClickHouse Cloud
  2. Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
  3. Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
  4. Build tooling and automation to eliminate repetitive operational work
  5. Help shape the roadmap for observability by identifying bottlenecks and scaling challenges

Skills

Required

  • Golang
  • Kubernetes
  • Helm
  • ArgoCD
  • Terraform
  • AWS
  • GCP
  • Azure
  • OpenTelemetry
  • Prometheus
  • Grafana

Nice to have

  • ClickHouse

What the JD emphasized

  • 5+ years building and running production systems at scale
  • production debugging skills
  • remote, async-friendly team