Site Reliability Developer 3

Oracle · Enterprise · Hyderabad, Telangana, India

This role focuses on Site Reliability Engineering (SRE) for Oracle's cloud services, with emphasis on the infrastructure, automation, availability, scalability, efficiency, and performance of large-scale distributed systems. The developer shares full-stack ownership of services and is responsible for capacity planning, performance analysis, security, and resiliency. They will partner with development teams to improve service architecture and act as an escalation point for critical issues.

What you'd actually do

  1. Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence.
  2. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services.
  3. Design and develop architectures, standards, and methods for large-scale distributed systems.
  4. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
  5. Work with the Site Reliability Engineering (SRE) team on shared full-stack ownership of a collection of services and/or technology areas.

Skills

Required

  • distributed systems
  • automation
  • scalability
  • performance
  • reliability
  • cloud services
  • infrastructure

Nice to have

  • SRE principles
  • capacity planning
  • system tuning
  • security
  • resiliency