AI/LLM SRE Lead Software Engineer

JPMorgan Chase · Banking · Columbus, OH +1 · Corporate Sector

Lead Software Engineer focused on SRE for AI-powered applications and infrastructure, emphasizing reliability, scalability, and automation. Responsibilities include designing AI-driven alerting and incident response systems, building observability, and mentoring engineers on AIOps.

What you'd actually do

  1. Ensure reliability, scalability, and performance of AI-assisted application and platform operations.
  2. Design and implement AI-driven solutions for intelligent alerting, noise reduction, and auto-correlation.
  3. Build and maintain observability, monitoring, and telemetry for AI applications and platforms.
  4. Build and support automation for alerting, anomaly detection, and self-healing workflows.
  5. Define and execute the roadmap for AI-assisted SRE and observability.

Skills

Required

  • SRE
  • DevOps
  • Platform Engineering
  • AWS
  • LLM APIs
  • OpenTelemetry
  • Grafana
  • Prometheus
  • ELK
  • CloudWatch
  • CI/CD
  • Automation
  • Distributed systems
  • Microservices
  • Cloud architectures
  • Python

Nice to have

  • AI-powered coding assistants
  • Prompt engineering
  • Embeddings
  • RAG pipelines
  • Operational copilots
  • Chatbots
  • Go

What the JD emphasized

  • 5+ years applied experience
  • Strong hands-on experience with AWS
  • Hands-on experience with AWS Bedrock, OpenAI, or LLM APIs
  • Expertise in observability tools
  • Proven track record in automation, operational tooling, and event-driven workflows

Other signals

  • AI-driven solutions for intelligent alerting
  • AI-assisted SRE and observability
  • LLM APIs