Site Reliability Engineer II

Axon · Enterprise · Canada · Remote · 1505 SAAS Ops

Site Reliability Engineer II role focused on building and maintaining cloud-native platforms and tools that let engineering teams provision services rapidly, consistently, securely, and cost-effectively. Emphasizes reliability best practices, code quality, problem-solving in distributed systems, and influence on architectural patterns.

What you'd actually do

  1. Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, securely, and cost-effectively.
  2. Exemplify cloud-native site reliability best practices.
  3. Write code that is performant, maintainable, clear, and concise.
  4. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems.
  5. Influence and educate the engineering organization to adopt new and improved architectural patterns.

Skills

Required

  • 5+ years of applicable experience.
  • Experience managing cloud platforms such as Azure, AWS, or similar.
  • Experience using managed languages such as Python, Go, C#, Java, or similar.
  • Experience operating in Kubernetes platforms like AKS, EKS, or similar.
  • Experience using CI/CD platforms to automate infrastructure provisioning, software builds, tests, and releases.
  • Experience using observability tools such as APM, logging, and metrics to assist with debugging issues.
  • Experience provisioning infrastructure with Infrastructure as Code tools such as Terraform, AWS CloudFormation, or similar.
  • Builder-operator mindset with proven production ownership (uptime, SLOs, on-call, incident leadership).
  • Empathy to support the needs of software engineers.

What the JD emphasized

  • proven production ownership (uptime, SLOs, on-call, incident leadership)