Cloud Support Engineering Manager (SRE Manager)

F5 · Enterprise · Field-CO

Lead a global team of Cloud Support Engineers to ensure SaaS infrastructure resilience and customer support for AI security, AI runtime, agentic systems, and enterprise LLM deployments. Focus on operational excellence, incident management, and strategic collaboration with Product and Eng teams.

What you'd actually do

  1. Hire, mentor, and retain the team. You’ll be responsible for career trajectories, technical upskilling, and maintaining morale during high-pressure outages.
  2. Define and refine support workflows. You’ll ensure that SLAs (Service Level Agreements) are met without burning out the team.
  3. Act as a primary escalation point for "Severity 1" issues. Coordinate between support, backend engineering, and the customer for rapid resolution.
  4. Work closely with Product and Eng teams to provide feedback on recurring technical pain points, turning support data into product improvements. You’ll also work with Solutions Engineers on customer-related issues.
  5. Track and report on key performance indicators (KPIs) like CSAT, MTTR, and first-response times.

Skills

Required

  • 5–7 years in Cloud/Technical Support or SRE
  • At least 3 years in a leadership or people-management role
  • Bachelor’s degree in CS, IT, or equivalent experience
  • Deep knowledge of at least one major provider (AWS, Azure, or GCP)
  • Proficiency in Linux and Kubernetes administration
  • Strong networking knowledge (TCP/IP, DNS, load balancing, SSL/TLS)
  • Familiarity with at least one scripting language (Python, Bash, or Go)
  • High empathy
  • Grace under fire
  • Ability to translate "tech-speak" for stakeholders

Nice to have

  • Infrastructure as Code (IaC): Hands-on experience with Terraform or Pulumi.
  • Orchestration: Experience supporting Kubernetes (K8s) or Docker in production environments.
  • FinOps: Ability to analyze cloud spend and identify optimization/cost-saving opportunities.
  • Security & Compliance: Familiarity with SOC2, HIPAA, or ISO 27001 standards.

What the JD emphasized

  • AI security
  • securing AI runtime
  • agentic systems
  • enterprise LLM deployments
  • advanced security models
  • AI runtime guardrails
  • SLAs (Service Level Agreements)
  • Severity 1