Site Reliability Engineer - OCI Data Services

Oracle · Enterprise · Czech Republic

This Site Reliability Engineer role focuses on operating and scaling Oracle's cloud-native data movement services (GoldenGate and Database Migration Service). The engineer is responsible for production health, reliability, and continuous improvement, including incident response, automation, observability, and capacity planning. The role involves partnering with software engineering teams to improve service architecture and reliability, and supporting services across commercial and sovereign cloud environments. Experience with public cloud platforms, automation, scripting, and observability tools is required.

What you'd actually do

  1. Operate and support production cloud services that power critical customer migration and data replication workloads.
  2. Monitor service health, investigate incidents, and lead troubleshooting efforts for complex production issues.
  3. Participate in incident response, root cause analysis, and post-incident reviews while driving corrective and preventative actions.
  4. Act as an escalation point for critical service issues and customer-impacting events.
  5. Partner with software engineering teams to improve service architecture, reliability, scalability, and operational readiness.

Skills

Required

  • Experience operating and supporting production cloud services.
  • Strong troubleshooting, incident management, and root cause analysis skills.
  • Experience with at least one major public cloud platform (OCI, AWS, Azure, GCP).
  • Experience with automation and Infrastructure-as-Code (Terraform or similar).
  • Scripting experience (Python, Shell, Go, or similar).
  • Experience with monitoring, alerting, logging, and observability tools.
  • Understanding of Linux, networking fundamentals, and distributed systems.
  • Ability to work cross-functionally with engineering and infrastructure teams.
  • Willingness to participate in on-call rotations.

Nice to have

  • Experience with Oracle Cloud Infrastructure (OCI) services and cloud operations.
  • Familiarity with data replication technologies such as Oracle GoldenGate or other large-scale data integration platforms.
  • Experience with Kubernetes, containerized workloads, and cloud-native architectures.
  • Knowledge of high-availability architectures, disaster recovery strategies, and service resilience best practices.
  • Experience supporting regulated or sovereign cloud environments.