CI Systems Engineer (AI Failure Analysis), Developer Workflows

Apple · Big Tech · Cupertino, CA · Software and Services

This role focuses on building AI-assisted systems to analyze CI build and test failures, transforming raw data into actionable insights for OS engineers. The engineer will design and implement AI-powered workflows for failure summarization, pattern identification, and distinguishing signal from noise, integrating AI capabilities into developer tools to accelerate the software development process.

What you'd actually do

  1. Develop AI-assisted failure analysis systems that transform raw CI data into actionable insights, helping developers quickly diagnose root causes and understand test failure patterns
  2. Design and implement AI-powered triage workflows that intelligently summarize failures, identify patterns across large result sets, and distinguish signal from noise
  3. Build and integrate tools that give AI systems structured access to CI data, enabling intelligent querying and analysis
  4. Optimize data structures and database design for fast storage and deduplication of build and test failures, ensuring AI systems have efficient access to the context they need
  5. Drive performance improvements and optimization initiatives for results storage and query latency to meet developer needs

Skills

Required

  • BS in Computer Science or equivalent professional experience
  • 8+ years of software engineering experience, preferably 2+ years focused on CI infrastructure, data systems, or failure analysis
  • Experience applying AI/ML or LLM-based approaches to software development workflows, tooling, or automation
  • Proficiency in one or more languages suited to systems and data work (Swift, Scala, Python, Go, C/C++, etc.)
  • Proven ability to work independently on complex problems and collaborate effectively on team initiatives
  • Strong communication skills to collaborate with diverse teams and translate complex failure data into developer-friendly insights
  • Demonstrated experience in designing or contributing to systems that handle scale, data integrity, and query performance

Nice to have

  • Experience building or integrating with AI agents using the latest available tools such as Skills, MCP Servers, Plugins, or LLM-powered tooling
  • Proven experience integrating AI into developer workflows with measurable impact on engineering efficiency, in areas such as code review, testing, debugging, triage, or productivity tooling
  • Familiarity with machine learning techniques applied to failure correlation, anomaly detection, or pattern recognition
  • Deep expertise in data storage, retrieval, and analysis, including experience with relational and NoSQL databases
  • Experience building data pipelines or working with distributed data processing frameworks
  • Background working on large-scale data systems, observability platforms, or analytics infrastructure
  • Experience with CI/CD failure analysis, test result aggregation, or build system diagnostics, including root cause analysis, diagnostic tooling, and observability practices
  • Knowledge of iOS or macOS internals, development environments, build agents, and testing infrastructure

What the JD emphasized

  • 8+ years of software engineering experience, preferably 2+ years focused on CI infrastructure, data systems, or failure analysis
  • Experience applying AI/ML or LLM-based approaches to software development workflows, tooling, or automation
  • Proven ability to work independently on complex problems and collaborate effectively on team initiatives
  • Demonstrated experience in designing or contributing to systems that handle scale, data integrity, and query performance

Other signals

  • AI-assisted systems for failure analysis
  • Transforming raw CI data into actionable insights
  • AI-powered triage workflows
  • Summarize failures, identify patterns, distinguish signal from noise