Senior Director of Site Reliability Engineering

JPMorgan Chase · Banking · Palo Alto, CA +1 · Corporate Sector

The Senior Director of Site Reliability Engineering leads firmwide adoption of AI capabilities in reliability operations, sets guardrails for AI-assisted and agentic workflows, and ensures safe and scalable implementation within a regulated financial environment.

What you'd actually do

Manages all team members' development by ensuring they have access to the resources needed for their unique development, and collaborates across the firm to align team members with mobility opportunities in line with their career aspirations
Leads firmwide reuse-first adoption of enterprise-authorized AI capabilities within the work environment to accelerate reliability planning, operational learning, and delivery execution, with human-in-the-loop validation and appropriate handling of sensitive data
Applies a wide range of tactics and strategies to guide internal executive decisions to achieve substantial goals
Manages multiple stakeholders and complex projects and teams
Implements innovative methods, techniques, and evaluation criteria for projects and people working on highly complex business issues

Skills

Required

Formal training or certification on site reliability engineering concepts and 10+ years applied experience
5+ years of experience leading technologists to manage, anticipate and solve complex technical items within your domain of expertise
Experience leading technologists to manage, anticipate, and solve complex technological issues firmwide
Demonstrated experience leading safe adoption of enterprise-authorized AI capabilities within the work environment at firm scale, including validation practices, data sensitivity considerations, and measurable reliability outcomes
Ability to define governance and decision frameworks for AI-assisted and agentic workflows, including control boundaries, auditability, and human approval checkpoints aligned to resiliency, security, and operational risk expectations
Experience hiring, developing, and recognizing talent
Prior experience influencing across highly matrixed, complex organizations and delivering value at scale
Experience leading complex projects supporting site reliability engineering design, scaling, resilience, and system performance assessments

Nice to have

Experience developing or leading cross-functional teams of technologists
Experience with hiring, developing, and recognizing talent
Experience leading a product as a Product Owner or Product Manager
Practical cloud native experience
Expertise in Computer Science, Computer Engineering, Mathematics, or a related technical field
Experience working at code level

What the JD emphasized

Leads firmwide reuse-first adoption of enterprise-authorized AI capabilities within the work environment to accelerate reliability planning, operational learning, and delivery execution, with human-in-the-loop validation and appropriate handling of sensitive data
Sets enterprise guardrails for AI-assisted and agentic workflows in reliability operations and delivery (e.g., approval controls, traceability/auditability, monitoring, and rollback expectations) aligned to resiliency, security, and risk standards
Demonstrated experience leading safe adoption of enterprise-authorized AI capabilities within the work environment at firm scale, including validation practices, data sensitivity considerations, and measurable reliability outcomes
Ability to define governance and decision frameworks for AI-assisted and agentic workflows, including control boundaries, auditability, and human approval checkpoints aligned to resiliency, security, and operational risk expectations

Other signals

leading firmwide reuse-first adoption of enterprise-authorized AI capabilities
Sets enterprise guardrails for AI-assisted and agentic workflows
Demonstrated experience leading safe adoption of enterprise-authorized AI capabilities at firm scale
Ability to define governance and decision frameworks for AI-assisted and agentic workflows


Join an iconic company and take your career to new heights by leading talented teams in transformative projects. Together, let's push boundaries and achieve unparalleled success.

As a Senior Director of Site Reliability Engineering at JPMorgan Chase within the Infrastructure Platforms and Foundational Services (IPFS) team, you are a force multiplier at both the line-of-business and firmwide levels. Inspire your team members and others to deliver durable and resilient products and services to our customers, define firmwide strategies for reliability, and guide and entrust your team to lead and execute those strategies.

Job Responsibilities

Manages all team members' development by ensuring they have access to the resources needed for their unique development, and collaborates across the firm to align team members with mobility opportunities in line with their career aspirations
Leads firmwide reuse-first adoption of enterprise-authorized AI capabilities within the work environment to accelerate reliability planning, operational learning, and delivery execution, with human-in-the-loop validation and appropriate handling of sensitive data
Applies a wide range of tactics and strategies to guide internal executive decisions to achieve substantial goals
Manages multiple stakeholders and complex projects and teams
Implements innovative methods, techniques, and evaluation criteria for projects and people working on highly complex business issues
Sets enterprise guardrails for AI-assisted and agentic workflows in reliability operations and delivery (e.g., approval controls, traceability/auditability, monitoring, and rollback expectations) aligned to resiliency, security, and risk standards

Required qualifications, capabilities and skills

Formal training or certification on site reliability engineering concepts and 10+ years applied experience. In addition, 5+ years of experience leading technologists to manage, anticipate and solve complex technical items within your domain of expertise (NAMR/APAC – India/LATAM/Hong Kong)
Experience leading technologists to manage, anticipate, and solve complex technological issues firmwide
Demonstrated experience leading safe adoption of enterprise-authorized AI capabilities within the work environment at firm scale, including validation practices, data sensitivity considerations, and measurable reliability outcomes
Ability to define governance and decision frameworks for AI-assisted and agentic workflows, including control boundaries, auditability, and human approval checkpoints aligned to resiliency, security, and operational risk expectations
Experience hiring, developing, and recognizing talent
Prior experience influencing across highly matrixed, complex organizations and delivering value at scale
Experience leading complex projects supporting site reliability engineering design, scaling, resilience, and system performance assessments

Preferred qualifications, capabilities and skills

Experience developing or leading cross-functional teams of technologists
Experience with hiring, developing, and recognizing talent
Experience leading a product as a Product Owner or Product Manager
Practical cloud native experience
Expertise in Computer Science, Computer Engineering, Mathematics, or a related technical field
Experience working at code level

This position is subject to Section 19 of the Federal Deposit Insurance Act. As such, an employment offer for this position is contingent on JPMorganChase’s review of criminal conviction history, including pretrial diversions or program entries.
