Staff Backend Engineer - Adaptive Telemetry | Germany | Remote

Grafana Labs · Data AI · EMEA, Germany, Spain, Sweden, United Kingdom · Remote · R&D : Databases

This is a Staff Backend Engineer role at Grafana Labs, focused on Adaptive Telemetry for the company's observability platform. The role involves driving technical strategy, leading project delivery, and owning the architecture, reliability, performance, and cost of critical systems. It emphasizes improving observability, automation, and operational readiness, mentoring engineers, and representing engineering internally and externally. The posting mentions AI coding assistants and access to frontier models as part of the developer workflow, but the core function is backend engineering for telemetry systems, not AI model development.

What you'd actually do

  1. Drive technical strategy and roadmap. Proactively define the architectural vision, prioritize work that unlocks major product or platform improvements, and influence product and engineering decisions.
  2. Lead end-to-end delivery of large, cross-functional projects. Own planning, design, execution, rollout and long-term operation of large initiatives.
  3. Own architecture, reliability, performance and cost for critical systems. Make pragmatic architecture choices that balance scalability, availability, latency and cost while ensuring systems remain maintainable and evolvable.
  4. Define SLOs/SLIs and lead incident response. Establish measurable reliability targets, run high-severity incident response, lead blameless post-mortems, and drive systemic fixes and automation to prevent recurrence.
  5. Improve observability, automation and operational readiness. Champion telemetry, alerting, runbooks, capacity planning and automation efforts that reduce toil, speed debugging and lower MTTR.

Skills

Required

  • Distributed systems
  • Systems design
  • Backend engineering
  • Observability
  • Reliability
  • Performance optimization
  • Cost optimization
  • Incident response
  • Automation
  • Mentoring

Nice to have

  • Experience with telemetry databases (Mimir, Loki, Tempo, Pyroscope)
  • Experience with open-source projects
  • Cloud platforms

What the JD emphasized

  • Proven delivery of large distributed systems.
  • Strong systems-design instincts.