Site Reliability Engineer II

Booking · Hospitality · Amsterdam, Netherlands · Engineering

As a Site Reliability Engineer II at Booking.com, you will focus on improving the stability, scalability, availability, and latency of e-commerce products through software solutions and automation. Responsibilities include designing and implementing systems, owning services, solving production issues, building monitoring and capacity tests, and advocating for engineering best practices.

What you'd actually do

  1. Design, develop and implement systems software that improves the stability, scalability, availability and latency of the Booking.com products;
  2. Take ownership of one or more services and have the freedom to do what is best for our business and customers;
  3. Solve problems occurring with our highly available production systems and build solutions and automation to prevent them from happening again;
  4. Build effective monitoring to track the health of your systems, and jump in to handle outages;
  5. Build and run capacity tests to handle the growth of your systems.

Skills

Required

  • Proven experience in solving algorithmic problems in at least one backend programming language (Java, Python, Go, Node.js, etc.);
  • Experience with building, operating and maintaining scalable distributed systems, and with operations automation;
  • Experience with Infrastructure as Code technologies;
  • Knowledge of cloud computing fundamentals;
  • Solid foundation in Linux administration and troubleshooting;
  • Understanding of service level agreements and objectives (SLAs, SLOs).

Nice to have

  • Experience with open-source infrastructure management and orchestration tools like Kubernetes, OpenStack, etc.;
  • Experience with monitoring and observability technologies like Prometheus, Graphite, Grafana, Kibana, Elasticsearch;
  • Experience in networking, security, or storage.