Data Integration Engineer

Apple · Big Tech · Hyderabad, India · Software and Services

Data Integration Engineer responsible for building and maintaining data pipelines for structured and unstructured data to support AIML model development and deployment. Focuses on data infrastructure, data quality, and integration for sales processes.

What you'd actually do

  1. Design and develop data integrations and data ingestion processes for Apple internal and external data.
  2. Build and maintain data pipelines for ingesting, processing, and transforming unstructured data sources, such as customer feedback, social media data, or sales call recordings.
  3. Develop data quality monitoring and validation processes specifically for AIML datasets, including identifying and addressing data bias.
  4. Work with data scientists to understand data requirements for AIML model training and deployment, ensuring data is available in the appropriate format and quality.
  5. Play an active role in the development and maintenance of user documentation, including data models, mapping rules, and data dictionaries.

Skills

Required

  • 5+ years of experience in designing, building, and maintaining scalable data solutions for large-scale analytics.
  • Proficiency in SQL and development experience with cloud database environments such as Snowflake, Redshift, or Databricks.
  • Proficiency in programming languages such as Python, Java, or R, and in open-source frameworks for distributed processing such as Hadoop and Spark.
  • Hands-on experience with development tools in a modern cloud data stack: code management and versioning with Git, CI/CD tooling, automation and orchestration with Apache Airflow or similar, and monitoring and alerting.
  • Experience architecting and developing data pipelines using ETL tools and API integrations with on-premises and cloud-based source systems.
  • Strong understanding of data modeling, data warehousing, and ETL concepts.
  • Experience with cloud platforms such as AWS, Azure, and Google Cloud.
  • Handling unstructured data (e.g., JSON, Parquet, text, images, audio, video).
  • Experience with data governance and observability tools (e.g., DataHub, Collibra).
  • Hands-on experience using dbt for transforming data in a cloud data warehouse.
  • Experience building and maintaining dbt models, tests, and documentation.
  • Understanding of dbt macros and Jinja templating.
  • Experience articulating and translating business questions into data solutions and proven ability to lead development projects from start to finish.
  • Experience and understanding of API development (REST, GraphQL, gRPC).
  • Broad knowledge of web standards relating to REST, HTTP, JSON, etc.
  • Experience with basic frontend development (HTML, CSS, JavaScript, Bootstrap, jQuery, etc.).
  • Experience with data labeling and annotation tools and processes.
  • Familiarity with AI/ML model development lifecycle and data needs for training and deployment.
  • Able to balance competing priorities, long-term projects, and ad hoc requirements.
  • Ability to work in a fast-paced, dynamic, constantly evolving business environment.

Nice to have

  • Experience with JupyterLab or Dataiku is a plus.
  • Familiarity with dbt best practices (modular models, sources, refs, macros)
  • BS or MS in Computer Science or equivalent industry experience.

What the JD emphasized

  • Building data infrastructure to support AIML initiatives.
  • Building and maintaining data pipelines for both structured and unstructured data, enabling the development and deployment of AIML models.
  • Developing data quality monitoring and validation processes specifically for AIML datasets, including identifying and addressing data bias.
  • Working with data scientists to understand data requirements for AIML model training and deployment.