Site Reliability Developer 3

Oracle · Enterprise · Noida, Uttar Pradesh, India

This role covers Site Reliability Engineering for cloud-native healthcare services, with an emphasis on using AI and AIOps to improve operations, automation, and system reliability. The engineer is responsible for the availability, scalability, and efficiency of production services, troubleshoots complex issues, and partners with development teams. AI/ML is applied to improve operations; building AI models is not the core of the job.

What you'd actually do

  1. Own the reliability, availability, performance, and operations of production services.
  2. Support cloud-native EHR platforms built with microservices, Kubernetes, and OCI.
  3. Improve monitoring, alerting, observability, and incident response.
  4. Use AI, automation, and AIOps to reduce manual work and improve system health.
  5. Build tools and scripts for deployment, monitoring, recovery, and operational tasks.

Skills

Required

  • Java
  • Python
  • Shell scripting
  • microservices
  • Kubernetes
  • cloud platforms
  • OCI
  • AWS
  • Azure
  • GCP
  • troubleshooting
  • debugging
  • monitoring
  • logging
  • alerting
  • observability tools
  • REST APIs
  • JSON
  • XML
  • SQL
  • secure data handling
  • automation
  • CI/CD
  • production deployment
  • customer-impacting issues
  • technical escalations

Nice to have

  • EHR platforms
  • healthcare platforms
  • HL7
  • FHIR
  • Oracle Health
  • New Millennium
  • Oracle Database
  • Kubernetes
  • OCI

What the JD emphasized

  • critical healthcare services
  • cloud-native EHR platforms
  • AI and AIOps
  • AI-driven operational automation
  • AI/AIOps for anomaly detection, alert correlation, and incident insights
  • self-healing and auto-remediation capabilities
  • applying AI to production operations