Site Reliability Engineer III - Payment Technology

JPMorgan Chase · Banking · Singapore · Commercial & Investment Bank

Site Reliability Engineer III focused on Payment Technology within JPMorgan Chase's Commercial & Investment Bank. The role involves configuring, maintaining, monitoring, and optimizing applications and their infrastructure using code and cloud technologies. Key responsibilities include designing and implementing observability, availability, reliability, and scalability solutions, managing infrastructure as code, resolving complex problems, and supporting SRE best practices. The role requires proficiency in Java/Spring Boot or Python, SRE principles, observability tools, CI/CD tools, and container orchestration. Familiarity with networking and cloud platforms is also expected.

What you'd actually do

  1. Collaborate with other software engineers and teams to design, develop, test, and implement observability, availability, reliability, and scalability solutions in their applications
  2. Implement infrastructure, configuration, and network as code for the applications and platforms in your remit
  3. Collaborate with technical experts, key stakeholders, and team members to resolve complex problems
  4. Understand service level indicators and use service level objectives to resolve issues proactively, before they impact customers
  5. Support the adoption of site reliability engineering best practices within your team

Skills

Required

  • Bachelor's Degree in Computer Science, Cybersecurity, Data Science, or related disciplines
  • 3+ years of SRE, systems engineering, or software development experience
  • Proficient in site reliability culture and principles
  • Proficient in at least one programming language: Java/Spring Boot or Python
  • Proficient knowledge of software applications and technical processes within a given technical discipline (e.g., cloud, artificial intelligence, observability)
  • Experience in observability, such as white-box and black-box monitoring, service level objective alerting, and telemetry collection, using tools such as Grafana, OpenTelemetry (OTel), Dynatrace, Prometheus, Datadog, and Splunk
  • Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform
  • Familiarity with containers and container orchestration technologies such as Docker, ECS, and Kubernetes
  • Familiarity with troubleshooting common networking technologies and issues
  • Ability to contribute to large, collaborative teams by presenting information logically and compellingly, with limited supervision, while proactively recognizing roadblocks and showing interest in learning technologies that drive innovation
  • Ability to identify new technologies and relevant solutions to ensure design constraints are met by the software team

Nice to have

  • Familiarity with modern observability tools such as OpenTelemetry (OTel), Dynatrace, Grafana, Prometheus, and Datadog.
  • Proficiency in backend technologies including Java/Spring and Python.
  • Experience working with Kubernetes for container orchestration.
  • Working knowledge of cloud platforms and infrastructure-as-code tools, specifically AWS and Terraform.