Staff Software Engineer - Grafana Cloud k6 | Canada | Remote

Grafana Labs · Data AI · Canada, United States · Remote · R&D: Performance testing (k6)

This role is for a Staff Software Engineer focused on establishing and scaling a cross-team culture of engineering excellence by setting standards and guiding adoption of strong DevOps/SRE practices. The role will also expand into broader application and product development leadership, contributing architectural and technical depth. While the company uses AI tools and provides access to frontier models, the core responsibilities are in SRE and product development, not AI model building.

What you'd actually do

  1. Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
  2. Drive mature DevOps/SRE practices, including incident response and post-incident reviews (PIRs), on-call readiness, runbooks, alerting, observability, and release/change management.
  3. Establish reliability frameworks such as SLIs/SLOs and error budgets, and use them to guide prioritization and engineering trade-offs.
  4. Provide visibility into system health through clear operational metrics and reliability reporting.
  5. Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.

Skills

Required

  • DevOps/SRE practices
  • operating and evolving production systems at scale
  • programming in a modern language
  • designing, building, and operating large-scale distributed systems
  • reliability engineering concepts
  • test automation
  • clear technical communication
  • interpersonal skills
  • modern software engineering processes
  • delivery practices

Nice to have

  • containerized and cloud-native systems
  • observability tooling and platforms
  • Python
  • Go
  • JavaScript
  • Jsonnet