Senior Staff Site Reliability Engineer

Fivetran · Data AI · Oakland, CA · Engineering Department

Fivetran is seeking a Senior Staff Site Reliability Engineer to ensure the performance, reliability, and stability of its data platform infrastructure. The role covers monitoring, incident response, automation, and collaboration with engineering teams to evolve systems and maintain high availability.

What you'd actually do

  1. Own the ongoing reliability and robustness of Fivetran’s production infrastructure by monitoring availability, capacity, and throughput.
  2. Evolve systems by building reliability into the product roadmap.
  3. Coordinate the reprioritization or fixing of critical bugs to meet support or sales requirements as needed.
  4. Work with engineering to recommend production infrastructure changes that ensure 100% availability.
  5. Ensure scalable artifact deployment to all environments through automation scripts.

Skills

Required

  • Managed Kubernetes (EKS, AKS, and GKE)
  • Cloud platforms and related tooling: AWS, Azure, GCP, Terraform, Ansible, Buildkite, Pulumi, and ArgoCD
  • Python/Shell scripting
  • Linux operating systems, internals, and administration
  • Cloud networking such as VPNs, PrivateLink, and Private Service Connect (GCP)
  • PostgreSQL

Nice to have

  • Java
  • Go

What the JD emphasized

  • 100% availability