Reliability Engineer, Global Reliability Intelligence Programs

Amazon · Big Tech · London, United Kingdom · Fulfillment & Operations Management

Reliability Engineer focused on Root Cause Analysis (RCA) and Failure Modes and Effects Analysis (FMEA) to identify and eliminate causes of failures, improve system reliability, and drive measurable improvements in uptime and performance. The role involves analyzing data, building dashboards, and collaborating with cross-functional teams to implement reliability improvements and enhance RCA/FMEA tools.

What you'd actually do

  1. Lead Root Cause Analysis (RCA) for high-impact and recurring failures, driving deep-dive investigations to identify true root causes and ensure effective, lasting corrective actions
  2. Develop, maintain, and continuously improve Failure Modes and Effects Analysis (FMEA) to proactively identify risks, prioritize mitigation, and prevent future failures
  3. Analyze equipment and operational data to identify trends, systemic issues, and performance gaps, translating findings into actionable reliability improvements
  4. Build and maintain BI dashboards, automated reports, and performance metrics (e.g., uptime, MTBF, failure rates) to enable data-driven decision-making
  5. Lead cross-functional execution of reliability improvements by partnering with operations, engineering, maintenance, and external vendors across multiple sites and regions

Skills

Required

  • Advanced knowledge of Microsoft Excel, including pivot tables, macros, INDEX/MATCH, VLOOKUP, VBA, and data links
  • Experience with data scripting languages (e.g., SQL, Python, R, or equivalent) or statistical/mathematical software (e.g., R, SAS, MATLAB, or equivalent)
  • Knowledge of BI analytics, reporting or visualization tools like Tableau, AWS QuickSight, Cognos or other third-party tools
  • Experience working with large-scale data mining and reporting tools (e.g., SQL, MS Power Query, Python), or experience in building financial and operational reports/data sets that inform business decision-making
  • Experience using data to drive root cause analysis for making business decisions with Excel or other analytical tools
  • Experience with predictive and preventative maintenance, repair, troubleshooting, and diagnostics on material handling equipment (MHE) and automated conveyor systems
  • Experience in at least one of these technology areas: DevOps, serverless, software development and design, CI/CD, AI/ML, Storage, Networking or Databases
  • Knowledge of infrastructure automation delivered through the software development lifecycle in an API-enabled environment, including agile development, software architecture/patterns, and modern cloud services
  • Experience in written and verbal communication with the ability to present complex technical information in a clear and concise manner to executives and non-technical leaders

Nice to have

  • Knowledge of data modeling and data pipeline design
  • Experience with industry standard tools and scripting languages (Python or Perl) for automation
  • Experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations, or experience contributing to the architecture and design (design patterns, reliability, and scaling) of new and current systems
  • Experience in software development, or experience with automation, version control tools, and network troubleshooting tools (telnet, Test-NetConnection, tracert, tracetcp, iperf, ntttcp, dig, and packet capture tools)
  • Knowledge of overall system architecture, scalability, reliability, and performance in a database environment
  • Experience with a range of research methodologies, including quantitative and qualitative methods, first-party (1P) and third-party (3P) data, trend analysis, and forecasting
  • Experience applying machine learning algorithms to find business-critical patterns in operational data
  • Experience developing and presenting recommendations for new metrics that give a better understanding of business performance
  • Experience in identifying, leading, and executing opportunities to improve, automate, standardize or simplify finance or business tools and processes, or experience working with stakeholders