Reliability Engineer at Apple

What you'd actually do

Architect and orchestrate the high-performance, scalable enterprise platforms that underpin our Data, ML, and inference platforms

Ensure high availability, strong performance, and minimal latency for our high-throughput applications

Manage diverse workloads across ML/Data/Inference platforms

Explore and evaluate the latest open source technologies and innovative solutions

Skills

Required

AWS/GCP or Kubernetes experience
Proficiency in at least one of the following languages: Python, Java, or Go
Ability to read and explain open source codebases
Understanding of, or exposure to, operating systems or networking and security principles

Nice to have

Exposure to data processing and model training or fine-tuning methodologies
Exposure to Spark/Flink and other modern cloud-native big data technologies
Exposure to managed cloud services like AWS Bedrock/GCP Vertex AI
Exposure to LLM infrastructure such as GPUs, TPUs, and Inferentia
Understanding of cloud networking concepts such as VPCs, DNS, security groups, and the Kubernetes network model
Expertise in performance tuning of JVMs and operating systems like Linux

Are you meticulously organized and highly observant? Join our Information Systems and Technology group and play a vital role on one of two Apple teams: Software and Services, or Corporate Functions. From Apple ID to the Apple website to our data centers around the globe, our diverse collection of engineers, designers and creators manages the massive systems and services that so many people rely on every single day! The Applied Machine Learning team has been at the forefront of accelerating digital transformation through machine learning across Apple's enterprise ecosystem. Its proven ML Platforms, Solutions, and Services provide a comprehensive suite of capabilities to achieve efficiency, agility, and innovation at Apple scale, serving business-critical needs across Apple’s enterprise. We are looking for a talented engineer to join our team and bring a passion for infrastructure and distributed systems to building world-class platforms and products at very large scale across cloud environments.

Description

As a Software Engineer on Apple's Applied Machine Learning team, you will play a pivotal role in architecting and orchestrating the high-performance, scalable enterprise platforms that underpin our Data, ML, and inference platforms. You will be responsible for ensuring high availability, strong performance, and minimal latency for our high-throughput applications, directly influencing the customer experience. Your responsibilities will include managing diverse workloads across ML/Data/Inference platforms, along with exploring and evaluating the latest open source technologies and innovative solutions. Strong interpersonal communication and the ability to collaborate across business and technical teams are essential. We are looking for enthusiastic engineers with an interest in one of the following areas:

Platform Reliability Engineer
Big Data Engineer
ML Engineer

Minimum Qualifications

Bachelor’s degree in Computer Science, Computer Engineering, or equivalent
AWS/GCP or Kubernetes experience
Proficiency in at least one of the following languages: Python, Java, or Go, and the ability to read and explain open source codebases
Understanding of, or exposure to, operating systems or networking and security principles
Relevant internship experience

Preferred Qualifications

Excellent analytical and problem-solving skills
Exposure to data processing and model training or fine-tuning methodologies
Exposure to Spark/Flink and other modern cloud-native big data technologies
Exposure to managed cloud services like AWS Bedrock/GCP Vertex AI
Exposure to LLM infrastructure such as GPUs, TPUs, and Inferentia
Understanding of cloud networking concepts such as VPCs, DNS, security groups, and the Kubernetes network model
Expertise in performance tuning of JVMs and operating systems like Linux

At Apple, we believe accessibility is a fundamental human right. You’ll find that idea reflected in everything here — in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong.

Learn about accessibility in Apple’s workplace
