Principal Software Development Engineer - Observability

Expedia · Hospitality · CA

Principal Software Engineer focused on building and operating a centralized, scalable, and cost-effective observability platform for a large engineering organization. This involves architecting telemetry pipelines for logs, metrics, and traces, driving OpenTelemetry adoption, implementing platform governance, and automating infrastructure lifecycle management. The role requires strong technical leadership, mentorship, and production debugging skills in a cloud-native environment.

What you'd actually do

  1. Architect and Build Core Telemetry Pipelines
  2. Drive OpenTelemetry Adoption
  3. Implement Platform Governance and Optimization
  4. Elevate the Practice of Observability
  5. Automate Infrastructure Lifecycle

Skills

Required

  • observability principles (logs, metrics, traces)
  • Prometheus
  • Grafana
  • Datadog
  • Splunk
  • OpenTelemetry
  • Go
  • Java
  • Python
  • Kubernetes
  • Docker
  • microservices
  • AWS

Nice to have

  • designing, building, and operating highly available, scalable, and resilient platforms
  • Terraform
  • Crossplane
  • ClickHouse
  • mentoring senior engineers
  • establishing standards for operational excellence and code quality

What the JD emphasized

  • 10x increase in data volume
  • thousands of services
  • final escalation point for complex, cross-cutting production incidents