Senior Site Reliability Engineer

Oracle · Enterprise · India

Senior Site Reliability Engineer for Oracle's Autonomous AI Database Team, responsible for building and maintaining the cloud service framework that powers autonomous database services. Focuses on automation, scaling, management, architecture, production operations, capacity planning, performance, deployment, and release engineering on Oracle Cloud Infrastructure.

What you'd actually do

  1. Take ownership of the implementation and production operations of a wide array of core system platform solutions
  2. React to production deficiencies by continuously implementing automation, self-healing, and real-time monitoring for production systems
  3. Be a strong contributor to the development of platform services, including architecture, provisioning, configuration, deployment, and support
  4. Partner with the distributed team in prototyping new database platform services
  5. Stay informed about cloud infrastructure stacks

Skills

Required

  • Oracle Engineered Systems and subsystems
  • deployment
  • optimization
  • maintenance
  • diagnose and resolve complex hardware/software issues
  • restore system functionality
  • root cause analysis
  • proactive mitigation strategies
  • Oracle Database technologies
  • RAC
  • Data Guard
  • ASM
  • RMAN
  • performance tuning
  • Linux platforms
  • system administration
  • maintenance
  • issue resolution
  • automation
  • Python
  • shell scripting
  • security best practices for web application delivery
  • network infrastructure concepts
  • configuration management tools

Nice to have

  • cloud platforms (OCI, AWS, Azure, GCP)
  • communication skills
  • analytical and problem-solving abilities

What the JD emphasized

  • production operations
  • automation
  • self-healing
  • real-time monitoring
  • architecture
  • provisioning
  • configuration
  • deployment
  • support
  • cloud infrastructure stacks
  • Extensive expertise in Oracle Engineered Systems and subsystems with hands-on experience in deployment, optimization, and maintenance.
  • Proven ability to diagnose and resolve complex hardware/software issues, restore system functionality, conduct root cause analysis, and implement proactive mitigation strategies.
  • Advanced proficiency in Oracle Database technologies, including RAC, Data Guard, ASM, RMAN, and performance tuning, with a focus on implementation and troubleshooting.
  • Operational mastery of Linux platforms (e.g., RHEL, OEL), including system administration, maintenance, and issue resolution.
  • Strong experience in automation using Python and shell scripting to develop custom solutions, streamline processes, and improve system efficiency.