Software Developer 5

Oracle · Enterprise · Seattle, WA +1

This role focuses on building and operating Oracle Cloud Infrastructure's observability platform, which handles massive-scale telemetry data (metrics, logs, traces) for internal services and customers. It involves designing and optimizing high-throughput ingestion, large-scale data processing, storage, and low-latency query systems for distributed environments.

What you'd actually do

  1. Lead the design, development, and operation of cloud-scale observability platforms supporting metrics, logs, traces, and related telemetry data.
  2. Architect and implement highly scalable, resilient, and cost-efficient telemetry collection, ingestion, processing, storage, and query systems.
  3. Drive the evolution of end-to-end observability pipelines, from instrumentation and data collection through real-time analytics and long-term retention.
  4. Design and optimize distributed systems capable of ingesting and processing massive volumes of telemetry data with stringent latency and availability requirements.
  5. Build and enhance query, search, and retrieval services that deliver fast, reliable, and intuitive access to observability data.

Skills

Required

  • design, development, and operation of cloud-scale observability platforms
  • telemetry collection, ingestion, processing, storage, and query systems
  • distributed systems design and optimization
  • high-throughput telemetry ingestion
  • large-scale data processing
  • cost-efficient storage
  • low-latency query execution
  • multi-tenant reliability
  • operational excellence
  • cloud-native observability platforms
  • metrics, logs, traces, and related telemetry data
  • instrumentation and data collection
  • real-time analytics
  • long-term retention
  • high-cardinality metrics
  • large-scale log analytics
  • distributed tracing workloads
  • query, search, and retrieval services
  • performance bottleneck identification and resolution
  • reliability, fault tolerance, scalability, security
  • technical strategy and architectural decisions
  • troubleshooting and root-cause analysis
  • emerging trends, technologies, and best practices in observability, distributed systems, data processing, and cloud-native architectures

Nice to have

  • Mentor senior and junior engineers
  • Provide technical leadership
  • Foster engineering best practices
  • Partner with service teams to improve instrumentation, telemetry quality, and operational visibility
  • Establish and monitor key service health, scalability, performance, and cost-efficiency metrics

What the JD emphasized

  • massive scale
  • massive volumes of telemetry data
  • large-scale data processing
  • large-scale log analytics
  • hyperscale cloud environments