Site Reliability Engineer (hosted Infra) - Platform

Elastic · Enterprise · United States · Platform - Cross Team

Site Reliability Engineer focused on automating and scaling multi-cloud infrastructure for Elastic Cloud, with an emphasis on reliability, infrastructure as code (IaC), and observability. The role involves building internal tools and services, managing host lifecycles, and contributing to a balanced SRE on-call rotation. It does not involve building AI models directly, but it does involve integrating AI tools to reduce operational burden.

What you'd actually do

  1. Engineering software to automate large-scale systems — building internal tools and services, not just running scripts.
  2. Optimizing the reliability and lifecycle of hosts across multiple cloud providers.
  3. Strengthening our observability posture — crafting alerting and monitoring systems that drive incident prevention over incident response.
  4. Scaling global infrastructure and evolving the infrastructure management processes to meet growing demand.
  5. Contributing to code reviews, sharing your work, planning what we need to do next, and both mentoring and being mentored by teammates.

Skills

Required

  • Golang
  • Production experience operating large-scale cloud compute
  • Deep experience with Linux systems
  • Proficiency working with containerized workloads in production
  • Customer-first, systems-thinking approach
  • Clear and maintainable documentation

Nice to have

  • Terraform
  • Puppet
  • Ansible
  • Argo CD
  • Argo Workflows
  • CUE
  • Docker
  • Kubernetes
  • Ubuntu
  • Ubuntu Live Patch
  • On-call experience
  • Observability tools (e.g. Elastic Stack, Graphite, Prometheus, Influx)
  • Engineering solutions with the Elastic Stack

What the JD emphasized

  • Building internal tools and services, not just running scripts
  • Customer-first, systems-thinking approach
  • A sensible approach to AI integration — identifying where AI tools genuinely reduce operational burden and embedding them into workflows without adding complexity.