SRE Engineering Manager - Pxe Erm

Manager for Site Reliability Engineering (SRE) with a focus on platform operations, stability, resilience, and automated optimizations. The role requires hands-on engineering, technical leadership, and collaboration with cross-functional teams to deliver and maintain high-quality software solutions backed by modern observability practices and SLAs. It emphasizes modern SRE practices, DevSecOps, CI/CD, and incremental delivery.

What you'd actually do

  1. Own and drive cost-effective solutions for platform operations, focusing on stability, resilience, and automated optimizations to reduce maintenance costs and meet business KPIs.
  2. Champion modern SRE practices and keep them aligned with business goals and operational standards. Take responsibility for requirement analysis, automation, integration, monitoring, and ongoing support.
  3. Contribute hands-on: maintain system integrity, solve complex problems through automation and predictive analytics, engineer modern solutions, and drive operational excellence.
  4. Develop lean engineering solutions through rapid, inexpensive experimentation to solve customer needs. Engage with customers and product teams to deliver the right solution for the product in the right way at the right time. Respond quickly to incidents and engage with stakeholders to deliver timely operational fixes.
  5. Favor action and evidence over extensive planning. Take a forward-leaning approach to complexity and uncertainty, delivering lean, supportable, and maintainable solutions.

Skills

Required

  • SRE
  • platform operations
  • stability
  • resilience
  • automation
  • cost optimization
  • business KPIs
  • requirement analysis
  • integration
  • monitoring
  • system integrity
  • predictive analytics
  • Agile
  • DevSecOps
  • CI/CD
  • observability
  • blue-green deployment
  • canary deployment
  • A/B testing
  • domain-specific knowledge
  • metrics
  • root-cause analysis
  • communication skills

What the JD emphasized

  • high-quality modern observability practices
  • SLAs
  • modern SRE practices
  • DevSecOps
  • CI/CD
  • Observability