Cloud Software Engineer

ClickHouse · Data AI · Product & Engineering

ClickHouse is seeking an experienced Cloud Software Engineer to join its Observability team. The role involves designing, building, and operating distributed systems for telemetry and observability, with a focus on reliability, performance, and cost-efficiency. The engineer will take part in the on-call rotation, build automation, contribute to the team's roadmap, and collaborate with other engineering teams. The role requires strong production debugging skills, a problem-solving mindset, and experience with systems-level languages, cloud providers, Kubernetes, and observability tools.

What you'd actually do

  1. Design, build, and operate distributed systems that power observability across ClickHouse Cloud
  2. Own reliability, performance, and cost-efficiency of our telemetry pipeline and storage systems
  3. Take part in the on-call rotation and help drive root-cause resolution and long-term fixes
  4. Build tooling and automation to eliminate repetitive operational work
  5. Help shape the roadmap for observability by identifying bottlenecks and scaling challenges

Skills

Required

  • 5+ years building and running production systems at scale
  • Proficiency in at least one systems-level language (we use Go, but C++, Rust, Python, etc. are fine)
  • Experience with Kubernetes, Helm, ArgoCD, and Terraform or similar IaC tools
  • Comfortable working with at least one major cloud provider (AWS, GCP, Azure)
  • Familiarity with OpenTelemetry, Prometheus, Grafana, or similar tools
  • Experience with ClickHouse

Nice to have

  • Great production debugging skills
  • Problem-solving mindset
  • Strong communication skills
  • Experience balancing system performance, reliability, and cost
  • Ability to iterate quickly: build MVPs, collect feedback, and improve continuously

What the JD emphasized

  • 5+ years building and running production systems at scale