Database Systems SRE, ASE Cassandra SRE

Apple · Big Tech · Seattle, WA · Software and Services

This role is for a Database Systems SRE focusing on Apache Cassandra for Apple's internet services. Responsibilities include developing and managing software for Cassandra deployment, maintenance automation, backup services, monitoring, and contributing to upstream patches. The role requires expertise in SRE concepts, distributed databases, performance engineering, and operating systems, with a focus on reliability, scalability, and speed at massive scale.

What you'd actually do

  1. Understanding of core SRE concepts - Monitoring, Alerting, Incident management.
  2. Understanding of database concepts (consistency models, isolation levels, crash and recovery semantics).
  3. Performance engineering (design concepts, profile-guided optimization).
  4. Service management across bare-metal, virtualized (EC2), and containerized (K8s) platforms.
  5. Fundamentals of system-level hardware and networking components (storage devices and controllers, network interfaces, CPU and memory layout in server-class systems).

Skills

Required

  • BS or MS in Computer Science / related fields or equivalent work experience
  • Support of internet-facing production services and distributed systems through deployments, on-call, and incident management.
  • Experience running large-scale infrastructure with a heavy reliance on automation tooling
  • Excellent troubleshooting and deep-dive performance analysis skills
  • Real operational experience managing services at scale on Kubernetes
  • Proficient in one or more of the following programming languages: Java, Go (golang), Python
  • Operational experience deploying in and running on datacenter and cloud architectures (networking topologies, host placement strategies, and failure modes), including the design of multi-datacenter systems, failure domains, and wide-area networking.

Nice to have

  • experience in this area is a plus
  • Prior experience with development or maintenance of distributed databases / storage systems is recommended.

What the JD emphasized

  • critical internet services
  • massive scale
  • hundreds of petabytes of data
  • internet-facing production services
  • large scale infrastructure
  • Datacenter and Cloud architectures