What you'd actually do

Experience architecting and implementing large-scale Observability platforms

Experience with internally hosted logging systems such as Splunk, ClickHouse, Loki, and Elastic, including assisting clients and improving environment performance and stability

Demonstrated ability to drive ingestion cost optimization through data-driven analysis, pipeline guardrails, and direct engagement with customer engineering teams to reduce unnecessary log volume

Experience with OpenTelemetry — including collector configuration, pipelines, and instrumentation — as a core requirement given Adobe's OTel-native observability strategy

AI agent development and experience integrating AI workflows into large-scale deployments; ability to build AI-assisted workflows to surface actionable insights from large log datasets and automate routine user interactions

Skills

Required

production-level experience with distributed applications at scale in public and/or private cloud
architecting and implementing large-scale Observability platforms
internally hosted logging systems like Splunk, ClickHouse, Loki, Elastic
ingestion cost optimization
OpenTelemetry
AI agent development
integrating AI workflows into large-scale deployments
build AI-assisted workflows
surface actionable insights from large log datasets
automate routine user interactions
architecting distributed environments with thousands of users
Go
Python
building integrations and applications for large-scale Observability environments
designing and implementing systems for fault tolerance, scalability and stability
developing, deploying and running distributed applications on cloud platforms
container and orchestration technologies (Docker, Kubernetes)
on-call coverage
triage and resolve issues across platforms
highest level of uptime and Quality of Service (QoS)
defining service level objectives (SLOs) and service level indicators (SLIs)
cloud deployments
collaborate with SRE and Engineering/Product teams
designing and maintaining production monitoring systems
solving performance and stability issues
Excellent communicator

Nice to have

evaluating and prototyping alternative storage/processing backends (e.g., ClickHouse, Loki)
Grafana
Cortex
Tempo
DevOps/SRE approach

Join a globally diverse team that both builds and finds best-of-breed tools to bring critical Observability services to all of Adobe. Our team embodies DevOps, as our responsibilities range from crafting new tools and UIs to maintaining and supporting one of the largest logging deployments in the industry, in partnership with other observability tools.

We’re a close-knit team dedicated to providing a robust platform, supporting both Adobe’s engineering teams and each other. We are looking for a DevOps Engineer to assist in building Adobe’s observability strategy.

If you enjoy solving complex tasks where it’s easy to draw a line from your efforts to real accomplishments, come talk to us.

Job Requirements

7 to 10+ years of production-level experience with distributed applications at scale in public and/or private cloud
Experience architecting and implementing large-scale Observability platforms

Must Have

Experience with internally hosted logging systems such as Splunk, ClickHouse, Loki, and Elastic, including assisting clients and improving environment performance and stability
Demonstrated ability to drive ingestion cost optimization through data-driven analysis, pipeline guardrails, and direct engagement with customer engineering teams to reduce unnecessary log volume
Experience with OpenTelemetry — including collector configuration, pipelines, and instrumentation — as a core requirement given Adobe's OTel-native observability strategy
AI agent development and experience integrating AI workflows into large-scale deployments; ability to build AI-assisted workflows to surface actionable insights from large log datasets and automate routine user interactions
Experience architecting distributed environments with thousands of users
Programming experience with languages like Go and Python; experience building integrations and applications for large-scale Observability environments
Experience designing and implementing systems for fault tolerance, scalability and stability
Experience developing, deploying and running distributed applications on cloud platforms; experience with container and orchestration technologies (Docker, Kubernetes)
Comfortable owning on-call coverage across a multi-tool observability stack, with the ability to triage and resolve issues across platforms beyond primary area of expertise
Ensure the highest level of uptime and Quality of Service (QoS) for Adobe's customers through operational excellence
Knowledge of defining service level objectives (SLOs) and service level indicators (SLIs) to represent and measure service quality
Knowledge of (public and/or private) cloud deployments
Collaborate with SRE and Engineering/Product teams in driving critical initiatives
Experience in designing and maintaining production monitoring systems
Experience in solving performance and stability issues using a wide variety of tools
Excellent communicator in and across teams, driving projects to completion
Impacts the organization through contribution to technical direction and strategic decisions

Good to Have

Experience evaluating and prototyping alternative storage/processing backends (e.g., ClickHouse, Loki)
Experience with other Observability tooling like Grafana, Cortex, and Tempo
Promote the DevOps/SRE approach

About Adobe

Adobe empowers everyone to create through innovative platforms and tools that unleash creativity, productivity and personalized customer experiences. Adobe’s industry-leading offerings, including Adobe Acrobat Studio, Adobe Express, Adobe Firefly, Creative Cloud, Adobe Experience Platform, Adobe Experience Manager, and GenStudio, enable people and businesses to turn ideas into impact, powered by AI and driven by human ingenuity.

Our 30,000+ employees worldwide are creating the future and raising the bar as we drive the next decade of growth. We’re on a mission to hire the very best and believe in creating a company culture where all employees are empowered to make an impact. At Adobe, we believe that great ideas can come from anywhere in the organization. The next big idea could be yours.

Let’s Adobe together

At Adobe, we believe in creating a company culture where all employees are empowered to make an impact. Learn more about Adobe life, including our values and culture, focus on people, purpose and community, Adobe for All, comprehensive benefits programs, the stories we tell, the customers we serve, and how you can help us advance our mission of empowering everyone to create.

Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other protected characteristic.

Adobe aims to make our Careers website and recruiting process accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email accommodations@adobe.com or call +1 408-536-3015.

AI Use Guidelines for Interviews: Our interviews are designed to reflect your own skills and thinking. The use of AI or recording tools during live interviews is not permitted unless explicitly invited by the interviewer or approved in advance as part of a reasonable accommodation. If these tools are used inappropriately or in a way that misrepresents your work, your application may not move forward in the process.

At Adobe, we empower employees to innovate with AI — and we look for candidates eager to do the same. As part of the hiring experience, we provide clear guidance on where AI is encouraged during the process and where it’s restricted during live interviews. See how we think about AI in the hiring experience.

DevOps Engineer - Observability
