Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale — unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data — securing and protecting private information more effectively — Elastic’s complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI.

What is the role

We are Cloud Infrastructure SREs who integrate, scale, and evolve multi-cloud infrastructure across 4 Cloud Service Providers, 70+ globally distributed regions, and tens of thousands of hosts to power Elastic Cloud. We tackle hard problems at scale through automation, Infrastructure as Code (IaC), configuration management, and purpose-built software that eliminates toil and improves reliability.

We're also a team that grows people as well as systems. If that challenge genuinely excites you, we'd love to hear from you.

What you will be doing

Engineering software to automate large-scale systems — building internal tools and services, not just running scripts.
Optimizing the reliability and lifecycle of hosts across multiple cloud providers.
Strengthening our observability posture — crafting alerting and monitoring systems that drive incident prevention over incident response.
Scaling global infrastructure and evolving the infrastructure management processes to meet growing demand.
Contributing to code reviews, sharing your work, planning what we need to do next, and both mentoring and being mentored by teammates.
Being part of a balanced SRE on-call rotation: responding to incidents, improving runbooks, participating in postmortems, and championing reliability improvements.

What you bring

Experience building software with Golang. You are also comfortable reviewing others' code and offering constructive feedback.
Production experience operating large-scale cloud compute (hundreds of hosts or more) via automated workflows.
Deep experience with Linux systems — you are at home in the terminal debugging at the OS level.
Proficiency working with containerized workloads in production.
A customer-first, systems-thinking approach to operational problems — you care about root causes, not just symptoms.
Comfortable working across time zones in both real-time and asynchronous contexts.
You contribute clear and maintainable documentation, such as software designs, runbooks, architecture diagrams/decisions, and postmortems.
You communicate project status regularly and clearly, flag blockers early, and follow through on action items.
A sensible approach to AI integration — identifying where AI tools genuinely reduce operational burden and embedding them into workflows without adding complexity.

Bonus Points

Production experience with any of: Terraform, Puppet, Ansible, Argo CD, Argo Workflows, CUE, Docker, Kubernetes, Ubuntu, or Ubuntu Live Patch.
Experience being on-call during incidents and using observability tools (e.g. Elastic Stack, Graphite, Prometheus, Influx) to diagnose issues, quantify impact, and confirm mitigations.
Hands-on experience engineering solutions with the Elastic Stack.

Compensation for this role is in the form of base salary. This role does not have a variable compensation component.

The typical starting salary range for new hires in this role is listed below. In select locations (including Seattle WA, Los Angeles CA, the San Francisco Bay Area CA, and the New York City Metro Area), an alternate range may apply as specified below.

These ranges represent the lowest to highest salary we reasonably and in good faith believe we would pay for this role at the time of this posting. We may ultimately pay more or less than the posted range, and the ranges may be modified in the future.

An employee's position within the salary range will be based on several factors including, but not limited to, relevant education, qualifications, certifications, experience, skills, geographic location, performance, and business or organizational needs.

Elastic believes that employees should have the opportunity to share in the value that we create together for our shareholders. Therefore, in addition to cash compensation, this role is currently eligible to participate in Elastic's stock program. Our total rewards package also includes a company-matched 401k with dollar-for-dollar matching up to 6% of eligible earnings, along with a range of other benefits offered with a holistic emphasis on employee well-being.

The typical starting salary range for this role is:

$143,100—$175,000 USD

The typical starting salary range for this role in the select locations listed above is:

$143,100—$175,000 USD

Additional Information - We Take Care of Our People

As a distributed company, diversity drives our identity. Whether you’re looking to launch a new career or grow an existing one, Elastic is the type of company where you can balance great work with great life. Your age is only a number. It doesn’t matter if you’re just out of college or your children are; we need you for what you can do.

We strive to have parity of benefits across regions, and while regulations differ from place to place, we believe taking care of our people is the right thing to do.

Competitive pay based on the work you do here and not your previous salary
Health coverage for you and your family in many locations
Ability to craft your calendar with flexible locations and schedules for many roles
Generous number of vacation days each year
Increase your impact - We match up to $2000 (or local currency equivalent) for financial donations and service
Up to 40 hours each year to use toward volunteer projects you love
Embracing parenthood with a minimum of 16 weeks of parental leave

Different people approach problems differently. We need that. Elastic is an equal opportunity employer and is committed to creating an inclusive culture that celebrates different perspectives, experiences, and backgrounds. Qualified applicants will receive consideration for employment without regard to race, ethnicity, color, religion, sex, pregnancy, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, disability status, or any other basis protected by federal, state or local law, ordinance or regulation.

We welcome individuals with disabilities and strive to create an accessible and inclusive experience for all individuals. To request an accommodation during the application or the recruiting process, please email candidate_accessibility@elastic.co. We will reply to your request within 24 business hours of submission.

Applicants have rights under Federal Employment Laws; view the posters linked below: Family and Medical Leave Act (FMLA) Poster; Pay Transparency Nondiscrimination Provision Poster; Employee Polygraph Protection Act (EPPA) Poster; and Know Your Rights Poster.

Elasticsearch develops and distributes technology and information that is subject to U.S. and other countries’ export controls and licensing requirements for individuals who are located in or are nationals of the following sanctioned countries and regions: Belarus, Cuba, Iran, North Korea, Syria, or Russia, including the Ukrainian territories annexed by Russia (The Crimea region of Ukraine, The Donetsk People's Republic (DNR), The Luhansk People's Republic (LNR), Kherson or Zaporizhzhia). If you are located in or are a national of one of the listed countries or regions, an export license may be required as a condition of your employment in this role. Please note that national origin and/or nationality do not affect eligibility for employment with Elastic.

Please see here for our Privacy Statement.

Site Reliability Engineer (hosted Infra) - Platform
