Lead Product Engineer – Observability (Enterprise App Monitoring) - Remote

Allstate · Insurance · IL · Remote

Lead Product Engineer for enterprise application monitoring within the observability group. Focuses on configuring, implementing, and supporting end-to-end monitoring for applications and platforms across various environments. Leads design, architecture, and engineering of digital products, ensuring successful delivery, testing, maintenance, and documentation. Promotes reliability and aligns software engineering efforts with the organization's vision for Enterprise Monitoring and Observability. The role supports the product lifecycle, interprets technical requirements, designs monitoring solutions, and plans/directs IT activities. Engineers for resilience and performance, develops infrastructure roadmaps, and designs/develops critical metrics. Coaches on environment sizing, deployment, DR, and architectural best practices. Leads inception and D&F of complex application solutions, serving as a subject matter expert.

What you'd actually do

  1. Understand Allstate’s core business drivers, articulate business objectives, align with strategic direction, prepare long-range plans, prioritize objectives, and adhere to policies.
  2. Apply advanced engineering excellence practices by enforcing technical standards, promoting test-driven development (TDD), and recommending scalable, system-aligned solutions that support both business growth and technology strategy.
  3. Promote system thinking by influencing stakeholders to prioritize enterprise-wide benefits and risk mitigation. Champion the adoption of standardized patterns, reusable components, and architectural roadmaps.
  4. Shape platform strategy by collaborating with platform consultants to design forward-looking standards and patterns that reflect emerging technology trends and evolving business models.
  5. Facilitate alignment across strategic domains by engaging with the appropriate Areas of Responsibility, business architects, and senior leaders to deliver integrated, high-impact solutions. Act as a connector between application and infrastructure domains to ensure cohesive system design.

Skills

Required

  • 3 years working in Agile environments with applied understanding of Site Reliability Engineering (SRE) principles such as SLIs/SLOs, error budgets, and operational readiness.
  • 3 years of development experience using one or more programming languages (Spring Boot/Java, Python, ReactJS, NodeJS), contributing to monitoring tools, automation, or integrations.
  • Proven hands-on experience with Kubernetes, plus practical experience on both Windows and Linux systems in enterprise or cloud environments.
  • Strong written and verbal skills for communicating engineering or technical delivery work, including documentation and presentations to technical and executive stakeholders.

Nice to have

  • 5+ years of overall engineering experience, including designing or building software, tools, or automation that enable observability for distributed platforms and infrastructure.
  • 4 years of hands-on experience supporting at least one major observability platform (Datadog, Dynatrace, New Relic, AppDynamics, or OTEL), including configuration, dashboards, alerting, and integrations.
  • Demonstrated interest and emerging experience applying new technologies, including Agentic AI–based solutions that improve observability, automation, or operational efficiency.

What the JD emphasized

  • Agentic AI–based solutions