Senior Software Engineer, Reliability Engineering

Airbnb · Consumer · São Paulo, Brazil · Software Engineering

Senior Software Engineer for Airbnb's Site Reliability Engineering team in São Paulo, Brazil. Focuses on developing and maintaining tools and systems for service reliability, monitoring, and incident management at scale. Responsibilities include incident response, leading high-severity incidents as Incident Commander, and collaborating with engineering teams to ensure service reliability and operability.

What you'd actually do

  1. Design, implement and maintain the tools and systems that support service reliability, monitoring, and alerting.
  2. Collaborate with other engineering teams to ensure services are designed with reliability in mind, and provide guidance on the appropriate use of tooling and automation.
  3. Identify opportunities to improve the reliability, scalability, and efficiency of our services and drive their implementation.
  4. Work with infrastructure engineers to understand the challenges they face in operating our services and develop tools and systems to help them manage these challenges.
  5. Participate in incident response and post-mortems to identify and address systemic issues.

Skills

Required

  • Bachelor's degree in Computer Science or related field.
  • 5+ years of experience in software engineering or SRE roles, with a focus on large-scale distributed systems.
  • Strong coding skills in at least one programming language, such as Java, Python, or Go.
  • Experience with distributed systems and service-oriented architectures.
  • Experience with cloud computing platforms such as AWS or Google Cloud Platform.
  • Strong conviction in software development best practices, including version control, automated testing, and continuous integration and delivery.
  • Experience with containerization technologies such as Docker and Kubernetes.
  • Excellent problem-solving and analytical skills, with strong attention to detail.
  • Ability to work effectively in a fast-paced and dynamic environment.
  • Strong communication and interpersonal skills.
  • Fluent in English (Professional Level).

What the JD emphasized

  • Senior Software Engineer
  • Site Reliability Engineering
  • incident management
  • high-severity incidents
  • Incident Commander