Sr. Site Reliability Engineer I

Axon · Enterprise · Canada · Remote · 1505 SAAS Ops

This role is for a Senior Site Reliability Engineer focused on building and operating cloud-native services with an emphasis on reliability, security, and automation. The role involves developing foundational platforms and tools, implementing test strategies, writing performant code, debugging distributed systems, and partnering with Identity and Security teams to strengthen user identity and access controls. Experience with Kubernetes, CI/CD, observability tools, and IAM concepts is required.

What you'd actually do

  1. Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, securely, and with high test confidence.
  2. Exemplify cloud-native site reliability best practices with a strong emphasis on testability, automation, and resilience in distributed systems.
  3. Design and implement test strategies and frameworks that validate reliability, performance, and security requirements (e.g., integration, end-to-end, chaos/resiliency, and regression suites).
  4. Write code that is performant, maintainable, clear, and concise.
  5. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems using logs, metrics, traces, and automated diagnostics.

Skills

Required

  • 6+ years of applicable software engineering or SRE experience.
  • 3+ years of experience managing cloud platforms such as Azure, AWS, or similar.
  • Experience operating Kubernetes platforms such as AKS, EKS, or similar.
  • Experience using managed languages such as Python, Go, C#, Java, or similar with demonstrable API and unit/integration testing experience.
  • Experience utilizing CI/CD platforms to automate provisioning infrastructure, software builds, releases, and automated test pipelines.
  • Experience using observability tools such as APM, logging, and metrics to assist with debugging issues and reliability improvements.
  • Experience designing tooling to simplify the operational management of SaaS/PaaS systems, including test automation and validation tooling.
  • Familiarity with building flexible and testable Infrastructure as Code modules.
  • Experience with, or strong working knowledge of, identity and access management (IAM) concepts and secure authentication/authorization patterns.
  • Empathy to support the needs of software engineers.

Nice to have

  • Experience integrating with or supporting Okta (or a similar IdP) and identity standards such as OIDC/SAML is a strong asset.

What the JD emphasized

  • Ability to obtain RCMP Enhanced Reliability Status and Secret clearance.