Sr. Manager SRE (Individual Contributor)

Capital One · Banking · Mexico City, Mexico

Senior Manager SRE (Individual Contributor) role focused on driving reliability transformation and pioneering AI-driven automation for payment networks. The role involves defining technical vision, leading cross-team initiatives, architecting automation for operational processes, and mentoring engineers. Key responsibilities include designing and building AI-driven solutions for alert classification, runbook generation, and automated remediation, as well as contributing to observability and settlement systems.

What you'd actually do

  1. Define and maintain a 12-18 month technical vision and roadmap for GPN SRE in Mexico City - decompose destination architecture into deliverable steps, sequence investments, and align execution across teams
  2. Drive reliability transformation across settlement, observability, and automation domains - establish SLOs, error budgets, severity frameworks, and operational standards that teams build against
  3. Pioneer AI and agentic automation approaches - design and build AI-driven solutions (using Claude Code, Copilot CLI, and LLM frameworks) for alert classification, runbook generation, automated remediation, and incident analysis; set patterns that other engineers extend
  4. Own the technical strategy for domain-specific knowledge ramp-up - identify which domain expertise requires deep engineering investment vs. documentation, and architect systems that reduce reliance on tribal knowledge
  5. Lead cross-team technical initiatives - drive observability platform convergence, standardize on COF tooling, and eliminate arbitrary uniqueness across towers

Skills

Required

  • Professional English fluency
  • Bachelor's degree
  • 8+ years of experience in SRE, production operations, or reliability engineering
  • Experience in DevOps Engineering
  • 8+ years of experience in at least one of the following: Java, Python, Go
  • 6+ years of experience with Cloud Native technologies (Amazon Web Services, Microsoft Azure, Google Cloud Platform)
  • 5+ years of experience with container orchestration services including Docker or Kubernetes
  • Experience with Shell or Bash scripting
  • 5+ years of Unix or Linux system administration experience

Nice to have

  • Experience developing automation solutions using agentic

What the JD emphasized

  • AI-driven approaches
  • AI and agentic automation
  • AI-driven solutions
  • AI engineering

Other signals

  • AI-driven automation
  • agentic automation
  • alert classification
  • runbook generation
  • automated remediation
  • incident analysis
  • LLM frameworks