Data Engineer II, Transportation Execution, Speed Team

Amazon · Big Tech · Bellevue, WA · Business Intelligence

Data Engineer II role focused on building and operating scalable data pipelines and infrastructure for Amazon's transportation logistics. Responsibilities include ETL/ELT development, Redshift cluster management, technical design, and collaboration with data scientists and business teams. The role emphasizes data reliability, efficiency, and optimization for reporting, analysis, and machine learning workloads.

What you'd actually do

  1. Own end-to-end design, development, and operation of ETL/ELT pipelines that extract, transform, and load data from diverse sources using SQL, Python, and AWS big data technologies
  2. Manage and optimize multiple production Redshift clusters, including performance tuning, capacity planning, and cost optimization to support transportation org reporting needs
  3. Lead technical design discussions with Product teams, Data Scientists, Software Developers, and Business Intelligence Engineers to define data infrastructure requirements and deliver scalable solutions
  4. Define and enforce data engineering best practices for your domain, including code quality standards, testing frameworks, documentation, and deployment processes
  5. Conduct thorough code reviews and mentor junior data engineers on technical problem-solving, coding standards, and AWS best practices

Skills

Required

  • 3+ years of data engineering experience
  • Experience with distributed systems as they pertain to data storage and computing
  • Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field
  • 2+ years of experience writing production data pipelines using SQL and Python
  • Experience designing and implementing ETL/ELT solutions with large-scale data processing

Nice to have

  • Experience with AWS technologies such as Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, SageMaker, EC2, and IAM roles and permissions
  • Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
  • Experience in mentoring, leading, or managing more junior engineers
  • Master's degree in Engineering, Computer Science, or a related field
  • Experience with data orchestration frameworks such as Apache Airflow, AWS Step Functions, or Glue Workflows
  • Experience with infrastructure-as-code tools (CloudFormation, Terraform, or CDK)