Site Reliability Developer 4

Oracle · Enterprise · Japan

This role is for a Site Reliability Developer focused on ensuring the availability, scalability, and operational excellence of OCI's Japan Sovereign Cloud services. Responsibilities include designing and implementing automation, driving reliability improvements, leading incident investigations, and partnering with development teams. The role requires strong software engineering skills, cloud operations experience, and participation in a 24x7 shift rotation.

What you'd actually do

  1. design and implement automation
  2. drive service reliability improvements
  3. lead complex incident investigations
  4. partner with development teams to improve operational readiness
  5. own and prioritize an SRD operational improvement backlog

Skills

Required

  • Site Reliability Engineering
  • Software Engineering
  • Cloud Infrastructure
  • DevOps
  • Java
  • Python
  • Go
  • C++
  • cloud platforms
  • infrastructure automation
  • observability
  • monitoring
  • incident response
  • Linux systems administration
  • networking
  • storage
  • performance optimization
  • troubleshooting complex cross-functional production issues
  • root cause analysis
  • participation in a 24x7 shift rotation
  • alert quality improvement
  • operational documentation

Nice to have

  • Native-level Japanese language proficiency
  • business-level English communication skills