Sr Site Reliability Engineer, Customer Systems

Apple · Big Tech · Austin, TX +1 · Software and Services

Site Reliability Engineer for Customer Systems at Apple, focusing on designing, building, and delivering highly scalable, reliable, and secure cloud infrastructure. The role involves automation, intelligent monitoring, and CI/CD pipeline management for customer-facing applications and services.

What you'd actually do

  1. Innovate, architect, build, and document highly available, scalable, reliable, and secure infrastructure
  2. Troubleshoot application-specific, network, system, and performance issues
  3. Build and maintain CI/CD infrastructure to enable fast delivery cycles for software engineering teams
  4. Envision and build automation tools to deliver infrastructure services reliably and in a repeatable fashion
  5. Collaborate with other site reliability engineers, software engineers, and quality engineers to gather, define, and analyze non-functional/technical requirements

Skills

Required

  • designing and building resilient, large-scale, low-latency cloud and on-prem infrastructure, including compute, storage, and network
  • deploying/managing Kubernetes using Helm
  • Shell Scripting
  • Python
  • Ansible
  • monitoring using Splunk, Grafana, Prometheus, Alertmanager
  • networking protocols: DNS, TCP, HTTP/HTTPS
  • setting up and managing CI/CD pipelines
  • Bachelor's or Master's in Computer Science or equivalent experience

Nice to have

  • Cassandra
  • MongoDB
  • Couchbase databases
  • AWS S3 or similar storage technologies
  • deploying, monitoring, and supporting Java applications
  • ArgoCD and GitOps model
  • defining, monitoring, and achieving key operational metrics such as MTTR and SLOs
  • GenAI tools in workflow automation for infrastructure management

What the JD emphasized

  • highly scalable, reliable, secure cloud infrastructure
  • low latency
  • Kubernetes
  • CI/CD pipelines
  • monitoring