Director, Software Engineering (Site Reliability Engineering)

Affirm · Fintech · United States · Remote · Infrastructure Platform Eng

Director of Software Engineering (Site Reliability Engineering) at Affirm, a fintech company. This role focuses on owning and driving execution for reliability, availability, and operational excellence across Affirm’s global platform. Responsibilities include setting vision, coordinating delivery of high availability for core services, iterating on incident response and lifecycle programs, performing continual risk management, and building/leading a global team of SREs, systems engineers, and full-stack engineers. The role requires significant experience in SRE and team leadership, with a focus on partnering across disciplines and ensuring operational excellence.

What you'd actually do

  1. Set the vision and drive execution for Reliability Engineering at Affirm
  2. Own and coordinate delivery of high availability for Affirm’s core services, to meet our service level standards and expectations with external partners
  3. Iterate on and maintain a best-in-industry global incident response & lifecycle program
  4. Build the software and program management structure needed to perform continual risk management across the entire Affirm system and Engineering organization
  5. Run a robust development lifecycle that establishes a culture of operational excellence, while experimenting and failing fast

Skills

Required

  • Software Engineering
  • Site Reliability Engineering
  • Team Leadership
  • Incident Response
  • Risk Management
  • Program Management
  • Operational Excellence
  • Cross-functional Collaboration
  • Communication

Nice to have

  • Full-stack code understanding

What the JD emphasized

  • own execution for reliability, availability, and operational excellence
  • ensure that our core services consistently meet high availability and performance expectations
  • partner closely with Product, Security, Enterprise Risk, Legal, Compliance, and Engineering leaders to proactively identify and mitigate systemic risks
  • balance hands-on technical depth with strategic leadership
  • 15+ years of relevant experience in software and site reliability engineering
  • Experience leading SRE, systems engineering, and full stack engineering teams
  • Successful track record of driving key outcomes that contribute to the company’s success
  • Comfortable partnering across disciplines and influencing across a wide variety of leaders
  • Keen technical mind comfortable reading and understanding full-stack code
  • Proven track record of establishing and growing teams, retaining talent, and comfort working with ambiguity