SRE Incident Response Lead

Zendesk · Enterprise · Krakow, Poland

Zendesk is seeking an Incident Response Lead to manage and coordinate incident response for its internal IT services. The role leads triage, restoration, swarming, and post-incident reviews for critical SaaS applications, with the goal of reducing MTTR and improving overall reliability.

What you'd actually do

  1. Lead incident triage and restoration: declare incidents, assess severity, and coordinate response across resolver teams.
  2. Drive incident swarming: assemble dynamic response teams and coordinate with domain owners, platform teams, and vendors to restore service.
  3. Serve as Incident Commander for major (Sev 0–2) IT incidents; own communication cadence and status updates to stakeholders.
  4. Facilitate post-incident reviews (PIRs) and blameless postmortems; ensure action items are captured and handed off to Problem Management.
  5. Partner with Observability & Monitoring to improve detection, alert hygiene, and incident creation from monitoring tools (e.g., Datadog).

Skills

Required

  • Incident Management
  • SRE
  • IT Operations
  • ITIL Incident Management
  • incident swarming
  • high-severity incident response
  • SaaS applications
  • communication
  • facilitation

Nice to have

  • global, high-growth tech company experience
  • SaaS applications experience
  • incident tools (incident.io, PagerDuty, Revere)
  • collaboration channels (Slack, Zoom)
  • observability, monitoring, or alerting (Datadog, Prometheus)
  • severity models
  • escalation paths
  • post-incident review best practices
  • ITIL 4 Foundation or Incident Management certification
  • automation/scripting (Python, PowerShell)
  • SRE principles
  • error budgets
  • blameless postmortem culture
  • customer support tech stack environment

What the JD emphasized

  • 5+ years of experience in Incident Management, SRE, or IT Operations roles
  • Experience leading high-severity incident response
  • Calm under pressure