Sr Site Reliability Engineer (ph)

Disney · Media · Bay Lake, FL +1

This role is for a Senior Site Reliability Engineer responsible for designing, building, and supporting a global platform for managing IoT Devices, Sensors, Toys, and Media. The role involves driving a DevOps culture, automating infrastructure, engineering high reliability, managing CI/CD pipelines for Java and Angular applications, and ensuring security and stability of systems running on Linux, Docker, AWS, and Kubernetes. The engineer will also be involved in incident response and root cause analysis.

What you'd actually do

  1. Drive a DevOps culture among peers and developers
  2. Design, build, and support the Connected Products platform
  3. Consult on, design, build, and support development pipelines; automate infrastructure and operations; create telemetry for monitoring; engineer high reliability; and reinforce best practices to secure company data
  4. Perform systems administration in Linux containers and Kubernetes clusters, and bring knowledge of systems, networking, operational excellence, application stability, security, performance, and capacity management, as well as documentation
  5. Sustain and manage the CI/CD pipelines of multiple Java-based applications and Angular single-page applications

Skills

Required

  • Minimum 5 years of related work experience
  • Proficient in agile environments
  • Applied understanding of observability principles using relevant tools
  • Familiarity with commercial and open-source monitoring platforms
  • Hands-on experience with CI tools like GitLab, AWS CodeBuild, Harness, and Rancher
  • Proficient in configuration management tools: Terraform and Helm
  • Experience in object-oriented programming languages such as Java
  • Experience in procedural programming languages such as Python
  • Experience with MongoDB
  • Skilled in cloud (AWS) and on-prem Kubernetes environments
  • Collaborative in building reliable, scalable enterprise systems
  • Capable of identifying root causes in large-scale distributed systems
  • Proficient in Linux administration, troubleshooting, and security
  • Experience leading technical projects and ensuring smooth delivery
  • Experience collaborating with Security Operations teams on secure solutions
  • Strong troubleshooting skills across systems, network, and code
  • Deep expertise and knowledge in the field

Nice to have

  • Experience with Incident Response
  • Experience working on cross-team projects
  • An ability to work both independently and collaboratively
  • Strong communication skills and a desire to share and learn

What the JD emphasized

  • extensive experience with web technologies
  • extensive experience with source control management using GitHub, GitLab, and Helm
  • extensive experience with systems, network, operational excellence and application stability, security, performance, and capacity management
  • extensive experience with documentation
  • extensive experience with CI/CD pipelines
  • extensive experience with Java-based applications
  • extensive experience with Angular single-page applications
  • extensive experience with incident response
  • extensive experience with root cause analysis
  • extensive experience with defect remediation
  • extensive experience with Linux administration
  • extensive experience with Docker Containers
  • extensive experience with AWS Deployments
  • extensive experience with Kubernetes Clusters
  • extensive experience with Terraform
  • extensive experience with Helm
  • extensive experience with MongoDB
  • extensive experience with AWS
  • extensive experience with On-Prem Kubernetes
  • extensive experience with object-oriented programming languages such as Java
  • extensive experience with procedural programming languages such as Python
  • extensive experience with cloud (AWS) and on-prem Kubernetes environments
  • extensive experience identifying root causes in large-scale distributed systems
  • extensive experience with Linux administration, troubleshooting, and security
  • extensive experience leading technical projects and ensuring smooth delivery
  • extensive experience collaborating with Security Operations teams on secure solutions
  • strong troubleshooting skills across systems, network, and code
  • deep expertise and knowledge in the field