Senior Site Reliability Engineer

Calendly · Enterprise · Remote · Engineering

Calendly is seeking a Senior Site Reliability Engineer to design, build, maintain, and operate its next-generation infrastructure platform. The role involves building tools, deploying open-source solutions, and consuming cloud services, with a focus on enabling application engineering teams through monitoring best practices and optimal infrastructure use. Responsibilities include developing infrastructure tools, evaluating cloud-native solutions, implementing Infrastructure as Code, enhancing observability, participating in on-call rotations, and defining standard practices for services and incidents.

What you'd actually do

  1. Building tools and applications to extend Calendly’s infrastructure platform
  2. Evaluating and deploying cloud-native open-source tools
  3. Exercising expertise in cloud infrastructure concepts and patterns
  4. Instituting resilient infrastructure through Infrastructure as Code
  5. Maintaining and improving observability of our infrastructure platform, and offering patterns that application teams can adopt for application observability

Skills

Required

  • Linux operating system
  • Cloud infrastructure (GCP)
  • Distributed systems
  • Reliability practices
  • Designing, building, and running highly-available production infrastructure
  • Golang or Python development
  • Kubernetes (Controllers and Operators)
  • Computer networking principles
  • Cloud networking technologies
  • Software and infrastructure monitoring tools (Datadog)

Nice to have

  • Mentoring others
  • Creative problem-solving
  • Communication of patterns and improvements
  • Internal customer collaboration

What the JD emphasized

  • Strong technical knowledge of cloud infrastructure (especially GCP), distributed systems, and reliability practices
  • Deep experience designing, building, and running highly-available production infrastructure
  • Strong Golang or Python development experience, especially writing APIs to build, orchestrate, and manage cloud infrastructure
  • Solid working knowledge of patterns and principles for designing and implementing cloud native applications on Kubernetes, such as Controllers and Operators
  • Robust knowledge of computer networking principles and extensive experience with cloud networking technologies to create scalable and secure environments
  • Extensive working experience with software and infrastructure monitoring tools (especially Datadog)