Senior Site Reliability Engineer II

Braze · Enterprise · New York, NY · Engineering

Site Reliability Engineers (SREs) keep all internal-facing services and platforms running smoothly and ensure site uptime. The role blends system administration and software engineering, applying engineering principles, operational discipline, and automation to infrastructure services. It involves improving automation, strengthening infrastructure reliability, and empowering other engineering teams. Braze operates at massive scale with a diverse technology stack that includes Ruby on Rails, MongoDB, Redis, Kafka, and Kubernetes. The Senior SRE will collaborate with Braze's engineering teams to improve infrastructure, automation, and tooling.

What you'd actually do

  1. Partner with Braze’s engineering teams on:
  2. Develop Braze’s internal platform infrastructure:
  3. Manage incidents:

Skills

Required

  • Linux and Unix Shell
  • Ruby
  • Go
  • Docker
  • Kubernetes
  • Terraform
  • MongoDB
  • Redis
  • Kafka
  • Postgres

Nice to have

  • Systems thinking
  • Automation
  • Incident management
  • Distributed systems
  • Networking
  • IaC technologies

What the JD emphasized

  • 5+ years of experience as a Software, DevOps, or Site Reliability Engineer
  • Strong programming skills (Ruby and/or Go preferred)
  • Experience with Docker, Kubernetes, Terraform, or similar IaC technologies
  • Experience with MongoDB, Redis, Kafka, Postgres, or similar data technologies