Staff Site Reliability Engineer - Splunk Expert

Okta · Enterprise · Bangalore, India · SW Eng - Infrastructure-672

A Staff Site Reliability Engineer role for someone with deep expertise in Splunk and Grafana who will own and evolve the observability ecosystem. The role involves architecting a comprehensive, scalable telemetry platform, optimizing Splunk for performance and cost, and integrating Splunk with automated workflows. Responsibilities include managing infrastructure as code with Terraform and automating agent and collector deployment in Go, Python, or Ruby.

What you'd actually do

  1. Splunk Architecture & Optimisation: Lead the design and tuning of Splunk environments. Optimise indexer performance, search efficiency, and data models to ensure rapid troubleshooting and cost-efficiency.
  2. Advanced Visualisation: Architect and maintain sophisticated Grafana dashboards that correlate disparate data sources into a single pane of glass for real-time system health.
  3. Automated Infrastructure: Design, build, and maintain scalable observability infrastructure using tools like Terraform.
  4. Pipeline Engineering: Optimise the collection, processing, and storage of telemetry data (Metrics, Logs, Traces) to ensure high reliability and low latency.
  5. Workflow Automation: Develop custom Splunk workflows and integrations that trigger automated responses to system events, reducing Mean Time to Resolution (MTTR).

Skills

Required

  • Splunk administration
  • Splunk search optimisation (SPL)
  • Splunk architecture
  • Grafana dashboard creation
  • Terraform
  • Go
  • Python
  • Ruby
  • OpenTelemetry (OTel)
  • Prometheus
  • Linux internals
  • Networking (TCP/IP, DNS, load balancing)
  • Kubernetes/EKS

Nice to have

  • Distributed tracing (Jaeger, Tempo, Honeycomb)
  • Splunk for security orchestration (SOAR)
  • SIEM-related workflows
  • AWS
  • Azure
  • GCP

What the JD emphasized

  • Splunk Mastery
  • Grafana Expertise
  • SRE Mindset
  • Programming Proficiency
  • Telemetry Standards
  • Distributed Systems