Senior Platform Engineer I

Booking · Hospitality · Bangalore, India

Senior Platform Engineer role focused on building, designing, and owning software applications and systems. Responsibilities include ensuring quality, refactoring code, evaluating architecture solutions, managing services end-to-end, resolving production issues, reducing operational costs through automation, improving monitoring and alerting, and providing architectural guidance and mentorship.

What you'd actually do

  1. Building Software Applications
     • Build software applications using relevant development languages and knowledge of the systems, services and tools appropriate for the business area, and guide more junior team members in this area.
     • Refactor and simplify code, introducing design patterns where necessary, and guide more junior team members in this area.
     • Ensure the quality of the application by following standard testing techniques and methods that adhere to the test strategy.
     • Write readable and reusable code by applying standard patterns and using standard libraries.
     • Maintain data security, integrity and quality by following company standards and best practices.
  2. Software Systems Design
     • Evaluate possible architecture solutions, taking into account cost, business requirements, technology requirements and emerging technologies.
     • Describe the implications of changing an existing system or adding a new system to a specific area, drawing on a broad, high-level understanding of the infrastructure and architecture of our systems.
     • Help grow the business and/or accelerate software development by applying engineering techniques (e.g. prototyping, spiking and vendor evaluation) and standards.
     • Meet business needs by designing solutions that satisfy current requirements and are adaptable for future enhancements.
  3. End to End System Ownership
     • Own a service end to end by actively monitoring application health and performance, setting and monitoring relevant metrics, and acting accordingly when they are violated.
     • Reduce business continuity and bus-factor risks by applying state-of-the-art practices and tools, and by writing appropriate documentation such as runbooks and OpDocs.
     • Reduce risk and obtain customer feedback by using continuous delivery and experimentation frameworks.
     • Independently manage an application or service through deployment and operations in production, and guide more junior team members in this area.
     • Maintain data security, integrity and quality by following company standards and best practices.
  4. Technical Incident Management
     • Address and resolve live production issues, mitigating customer impact within SLA.
     • Improve the overall reliability of systems by producing long-term solutions through root cause analysis.
     • Keep track of incidents by contributing to postmortem processes and logging live issues.
  5. Automation and Toil Reduction
     • Ensure that infrastructure stays current by reducing technical debt, searching for bottlenecks and preparing for scaling.
     • Reduce the cost of operations and maintenance by leveraging new technologies and automation, and by partnering with vendors to ensure we stay current.
     • Reduce human labour by writing small software features that address availability, scalability, latency and efficiency.

Skills

Required

  • software development languages
  • systems, services and tools
  • design patterns
  • standard testing techniques
  • readable and reusable code
  • data security, integrity and quality
  • architecture solutions
  • business requirements
  • technology requirements
  • emerging technologies
  • infrastructure and architecture
  • engineering techniques
  • prototyping
  • spiking
  • vendor evaluation
  • application health and performance monitoring
  • metrics
  • business continuity
  • documentation
  • runbooks
  • OpDocs
  • continuous delivery
  • experimentation frameworks
  • deployment and operations
  • production
  • live production issues
  • SLA
  • root cause analysis
  • postmortem processes
  • technical debt
  • scaling
  • cost of operations and maintenance
  • automation
  • availability
  • scalability
  • latency
  • efficiency
  • performance of production systems
  • network infrastructure
  • observability metrics
  • business KPIs
  • capacity planning
  • critical thinking
  • analytical thinking
  • process improvement
  • system improvements
  • structural improvements
  • performance gains
  • clear, well-structured, and meaningful communication
  • adaptable communication
  • active listening
  • technical solution design
  • functional, nonfunctional & architectural requirements
  • coaching
  • mentoring