Manager, Software Engineering

New Relic · Enterprise · Hyderabad, India · Telemetry Data Platform

Engineering Manager for the Kubernetes Observability team at New Relic. The role involves leading a team of Go and Python engineers who develop and maintain Kubernetes integrations, including AI-driven SRE Agents for anomaly detection and diagnosis. The team builds data collectors, backend services, and UI dashboards. Experience with distributed systems, container orchestration, and AI/agent workflows is required.

What you'd actually do

  1. Manage, coach, and mentor a committed group of Go and Python engineers, fostering an inclusive culture where they can grow their professional abilities and share code ownership.
  2. Guide architectural decision-making around Kubernetes integrations and OpenTelemetry (OTel) semantic conventions.
  3. Ensure the successful planning and execution of software projects, from the low-level instrumentation agents and Helm charts to the backend services and frontend UI components.
  4. Partner closely with Product Managers, Designers, and other engineering teams to align on product goals and deliver high-quality software at a regular cadence.
  5. Implement DevOps methodologies where the team builds, maintains, and supports its own software, including managing second-layer support rotations and incident response.

Skills

Required

  • 8+ years of overall software engineering experience, with a strong background in large-scale distributed systems and high-cardinality data processing
  • 2+ years of experience directly managing and scaling engineering teams
  • Deep understanding of Kubernetes architectures and familiarity with scaling applications in complex, multi-cloud containerized environments (EKS, AKS, GKE)
  • Hands-on background with Go and/or Python, Linux systems, and API development
  • Proven experience handling customer incidents, diving deep into technical implications, and mitigating risks in production environments
  • Strong analytical skills and an autonomous, creative approach to overcoming technical blockers

Nice to have

  • Background in building or managing AI workflows, LLM orchestration, or Model Context Protocol (MCP) servers
  • Deep experience with OpenTelemetry, Prometheus, eBPF, or similar telemetry frameworks
  • Experience developing Helm charts or Kubernetes Operators, or working with different Unix OSes and Windows nodes
  • Upstream contributions to relevant Cloud Native Computing Foundation (CNCF) or open-source infrastructure projects

What the JD emphasized

  • AI-driven K8s SRE Agents that autonomously detect and diagnose complex cluster anomalies
  • AI/Agent Experience: Background in building or managing AI workflows, LLM orchestration, or Model Context Protocol (MCP) servers.

Other signals

  • Building Go/Python data collectors, highly scalable backend services, and intuitive UI dashboards