Staff Site Reliability Engineer

Fivetran · Data AI · Oakland, CA · Engineering Department

Fivetran is seeking a Staff Site Reliability Engineer to ensure the reliability and robustness of its data platform infrastructure. The role involves monitoring availability, capacity, and throughput; evolving systems with reliability in mind; coordinating bug fixes; making infrastructure recommendations; automating deployment; and monitoring and remedying infrastructure vulnerabilities. The engineer will work closely with engineering, product, support, and sales.

What you'd actually do

  1. Own the ongoing reliability and robustness of Fivetran’s production infrastructure by monitoring availability, capacity, and throughput.
  2. Evolve systems by building reliability work into the product roadmap.
  3. Coordinate the reprioritization or fixing of critical bugs to meet support or sales requirements as needed.
  4. Recommend production infrastructure changes, working with engineering toward the stated goal of 100% availability.
  5. Ensure scalable artifact deployment to all environments through automation scripts.

Skills

Required

  • Managed Kubernetes (EKS, AKS, and GKE)
  • Cloud platforms and related tooling: AWS, Azure, GCP, Terraform, Ansible, Buildkite, Pulumi, and ArgoCD
  • Python and shell scripting
  • Linux operating systems, internals, and administration
  • Cloud networking, such as VPNs, PrivateLink, and Private Service Connect (GCP)
  • Databases such as PostgreSQL

Nice to have

  • Java
  • Go

What the JD emphasized

  • 100% availability