Product Manager – Enterprise AI Operations & Observability

Eli Lilly · Pharma · Hyderabad, India

Product Manager for Enterprise AI Operations & Observability at Eli Lilly, responsible for the strategy, execution, and adoption of AIOps and observability capabilities that improve system reliability and reduce operational toil. The role spans program management, stakeholder coordination, vendor management, and advocacy for data-driven incident management practices.

What you'd actually do

  1. Lead end-to-end program management for AIOps and Observability initiatives, from roadmap planning through delivery and adoption.
  2. Define and drive the strategic vision for how AI/ML-driven insights, automated remediation, and unified observability platforms reduce operational toil and improve system reliability.
  3. Coordinate across platform engineering, SRE, infrastructure, application development, and vendor teams to align priorities and resolve dependencies.
  4. Establish and track KPIs such as MTTD, MTTR, alert noise reduction, incident volume, and automation coverage.
  5. Manage vendor relationships and tool evaluations for observability platforms (e.g., Datadog, Splunk, Dynatrace, Elastic, Grafana, ServiceNow ITOM) and AIOps solutions.
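
As a rough illustration of the KPIs named in item 4, the sketch below computes MTTD, MTTR, automation coverage, and alert noise reduction from a small set of incident records. The `Incident` fields, the sample timestamps, and the alert counts are hypothetical and do not come from the posting. Definitions also vary between organizations: here MTTD is measured from fault start to detection and MTTR from detection to resolution, which is only one common convention.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # All fields are hypothetical; real incident tooling will use its own schema.
    started: datetime       # when the fault began
    detected: datetime      # when monitoring first raised it
    resolved: datetime      # when service was restored
    auto_remediated: bool   # True if automation resolved it without human action


def mttd_minutes(incidents):
    """Mean time to detect: average of (detected - started), in minutes."""
    return mean((i.detected - i.started).total_seconds() / 60 for i in incidents)


def mttr_minutes(incidents):
    """Mean time to resolve: average of (resolved - detected), in minutes.

    Assumption: the clock starts at detection. Some teams start it at fault onset.
    """
    return mean((i.resolved - i.detected).total_seconds() / 60 for i in incidents)


def automation_coverage(incidents):
    """Share of incidents that were resolved by automated remediation."""
    return sum(i.auto_remediated for i in incidents) / len(incidents)


def noise_reduction(baseline_alerts, current_alerts):
    """Fractional drop in alert volume relative to a baseline period."""
    return (baseline_alerts - current_alerts) / baseline_alerts


# Made-up sample data for illustration only.
t0 = datetime(2024, 1, 1, 9, 0)
incidents = [
    Incident(t0, t0 + timedelta(minutes=5), t0 + timedelta(minutes=35), True),
    Incident(t0, t0 + timedelta(minutes=15), t0 + timedelta(minutes=75), False),
]

print(f"MTTD: {mttd_minutes(incidents):.1f} min")
print(f"MTTR: {mttr_minutes(incidents):.1f} min")
print(f"Automation coverage: {automation_coverage(incidents):.0%}")
print(f"Alert noise reduction: {noise_reduction(2000, 500):.0%}")
```

In practice these figures would be pulled from the incident management and observability tooling rather than computed by hand, and the exact definitions should be agreed with stakeholders before any targets are set.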

Skills

Required

  • Technical project management
  • Infrastructure, SRE, platform engineering, or ITOps domains
  • Observability pillars (metrics, logs, traces)
  • Modern monitoring ecosystem
  • AIOps concepts
  • Managing programs across multiple engineering teams, geographies, and tool ecosystems
  • Translating technical concepts to non-technical audiences
  • Delivering large-scale platform or operations transformation programs
  • Stakeholder management
  • Experience with at least one major observability platform (Datadog, Splunk, Dynatrace, Elastic, New Relic, Grafana stack)
  • Cloud-native environments (AWS, Azure, GCP)
  • Container orchestration (Kubernetes)
  • ServiceNow ITOM/ITSM integration patterns

Nice to have

  • OpenTelemetry
  • eBPF
  • Observability pipeline tools (Cribl, Vector)
  • ITIL, PMP, SAFe, or equivalent certifications
  • FinOps principles

What the JD emphasized

  • AIOps and Observability capabilities
  • AI/ML-driven insights
  • automated remediation
  • unified observability platforms
  • observability pillars (metrics, logs, traces)
  • AIOps concepts
  • anomaly detection
  • event correlation
  • noise suppression
  • root cause analysis
  • large-scale platform or operations transformation programs
  • observability-as-code practices
  • SLO-based alerting
  • distributed tracing