Director, Software Engineering (Site Reliability Engineering)

Affirm · Fintech · United States · Remote · Infrastructure Platform Eng

Director of Software Engineering (Site Reliability Engineering) at Affirm, a fintech company. This role focuses on building and maturing reliability practices, incident response, risk management, and operational lifecycle programs. The director will lead a global team of SREs, systems engineers, and full-stack engineers, partnering with various leaders to mitigate systemic risks and ensure high availability of core services. The role requires significant experience in SRE and team leadership, with a focus on operational excellence and risk management within a regulated environment.

What you'd actually do

  1. Set the vision and drive execution for Reliability Engineering at Affirm
  2. Own and coordinate the delivery of high availability for Affirm’s core services, to meet our service level standards and the expectations set with external partners
  3. Iterate on and maintain a best-in-industry global incident response & lifecycle program
  4. Build the software and program management structure needed to perform continual risk management across the entire Affirm system and Engineering organization
  5. Run a robust development lifecycle that establishes a culture of operational excellence, while experimenting and failing fast

Skills

Required

  • 15+ years of relevant experience in software and site reliability engineering
  • Experience leading SRE, systems engineering, and full stack engineering teams
  • Successful track record of delivering key outcomes that drive the company’s success
  • Comfortable partnering across disciplines and influencing across a wide variety of leaders
  • World-class communicator with excellent instincts for empathetic messaging
  • Keen technical mind comfortable reading and understanding full-stack code
  • Proven track record of establishing and growing teams and retaining talent, and comfortable working with ambiguity

What the JD emphasized

  • partner closely with Product, Security, Enterprise Risk, Legal, Compliance, and Engineering leaders
  • mitigate systemic risks
  • high availability of Affirm’s core services
  • incident response
  • risk management
  • operational lifecycle programs
  • continual risk management
  • operational excellence