Staff Software Engineer - Platform, SysEng | USA | Remote

Grafana Labs · Data AI · Canada, United States · Remote · R&D: Platform

Staff Software Engineer role focused on building and scaling Grafana Cloud's observability platform, which handles millions of metrics, logs, and traces per second. The role involves working with distributed systems, cloud-native architectures, and operational practices to improve performance, reliability, and efficiency. While the company uses AI tools for development and mentions AI in its product, the core of this role is platform engineering, not direct AI/ML model development.

What you'd actually do

  1. We are hiring for the Platform SysEng squad. This is an accelerated, cross-cutting squad that is focused on the maturity and scalability of the platform.
  2. Currently, SysEng is working across engineering to shorten new-region build timelines and meet customer demand.
  3. We’re part of a Platform Engineering group that manages infrastructure for the teams building some of the most cherished tools: Grafana, Mimir, Loki, Tempo, and Pyroscope, to name a few.
  4. You enjoy working with engineers, as well as with the management structures that are there to support you and enable you and your team to do your very best.
  5. You are comfortable working in a remote-first company; communication is key.

Skills

Required

  • Go
  • Python
  • Shell
  • distributed systems
  • cloud-native architectures
  • microservices
  • containers/Kubernetes
  • IaC
  • system design
  • latency
  • consistency
  • availability
  • scaling
  • cost
  • reliability
  • performance

Nice to have

  • experience with operating your code
  • experience with developer productivity tools
  • AI coding assistants

What the JD emphasized

  • operating your code
  • large distributed systems
  • shipping and operating complex systems
  • system design
  • cloud-native architectures
  • operational practices
  • reliability and performance ownership