Staff Software Engineer - Grafana Cloud k6 | Spain | Remote

Grafana Labs · Data AI · Canada, Germany, Ireland, Spain, UK, United States · Remote · R&D: Performance testing (k6)

Staff Software Engineer role at Grafana Labs focused on building and operating performance testing products (Grafana k6, Grafana Cloud k6, Grafana Cloud Synthetics). The role emphasizes establishing and scaling a culture of engineering excellence, reliability, and operational ownership. Responsibilities include hands-on coding, guiding teams on distributed systems, maturing SRE practices, establishing reliability frameworks, and influencing product direction. The listing also mentions using AI coding assistants and access to frontier models as part of the workflow.

What you'd actually do

  1. Contribute hands-on to the codebase by designing and implementing production-quality software.
  2. Guide teams in the design, development, evolution, and operation of large-scale, distributed cloud systems.
  3. Build and scale a strong culture of operational excellence by defining standards and coaching teams to own reliability and availability.
  4. Help mature SRE practices, including incident response and post-incident reviews (PIRs), on-call readiness, runbooks, alerting, observability, and release/change management.
  5. Establish reliability frameworks such as SLIs/SLOs and error budgets, and use them to guide prioritization and engineering trade-offs.

Skills

Required

  • Strong programming background in a modern language (Python and Go are the primary languages, but prior experience with them is not required)
  • Experience designing, building, and operating large-scale distributed systems
  • Strong experience with SRE practices, including operating and evolving production systems at scale
  • Strong understanding of reliability engineering concepts (e.g. incident management, observability, and failure modes)
  • Strong experience defining or applying SLIs/SLOs, error budgets, or reliability metrics
  • Experience with test automation, including performance and functional testing
  • Ability to influence engineering practices through clear technical communication, reviews, and collaboration
  • Strong interpersonal skills and ability to work effectively across teams
  • Familiarity with modern software engineering processes and delivery practices
  • Self-driven and comfortable operating with a high degree of autonomy

What the JD emphasized

  • large-scale distributed cloud systems
  • SRE practices
  • reliability engineering concepts
  • SLIs/SLOs
  • performance and functional testing
  • modern software engineering processes and delivery practices