Software Engineering III - AI/ML Engineer

JPMorgan Chase · Banking · London, United Kingdom · Corporate Sector

Site Reliability Engineer for AI/ML Data Platforms responsible for building scalable, resilient data solutions, coordinating incident management, performing root cause analysis, and developing/supporting AI/ML solutions for troubleshooting. The role emphasizes using enterprise-authorized AI coding assist tools and understanding responsible AI use.

What you'd actually do

  1. Leverage enterprise-authorized AI coding assist tools within the work environment to improve code quality, delivery speed, and productivity across complex deliverables (e.g., code generation/refactoring, unit test creation, documentation), while validating outputs through peer review, automated testing, and secure coding standards; contribute learnings and reusable patterns to improve broader team effectiveness.
  2. Apply knowledge of tools within the Software Development Life Cycle toolchain, including enterprise-authorized AI-assisted development and automation capabilities, to improve the value realized by automation.
  3. Develop and support AI/ML solutions for troubleshooting and incident resolution.
  4. Apply expertise in application development and support across multiple technologies such as Databricks, Snowflake, AWS, and Kubernetes.
  5. Coordinate incident management coverage to ensure effective resolution of application issues.

Skills

Required

  • site reliability culture and principles
  • running production incident calls and managing incident resolution
  • observability (white and black box monitoring, service level objective alerting, and telemetry collection)
  • SLI/SLO/SLA and Error Budgets
  • Python or PySpark for AI/ML modeling
  • reducing toil by building new tools to automate repetitive tasks
  • system design, resiliency, testing, operational stability, and disaster recovery
  • risk controls and compliance with departmental and company-wide standards
  • collaborative teamwork

Nice to have

  • 4+ years in an SRE or production support role with AWS Cloud, Databricks, Snowflake, or similar technologies
  • AWS and Databricks certifications

What the JD emphasized

  • enterprise-authorized AI coding assist tools
  • responsible AI use
  • critically evaluate, validate, and refine AI-generated outputs

Other signals

  • AI/ML Data Platforms
  • AI coding assist tools
  • AI/ML solutions for troubleshooting and incident resolution
  • responsible AI use