Data Engineer, Seller Partner Trust and Store Integrity Science

Amazon · Big Tech · Seattle, WA · Data Science

Data Engineer role focused on building and maintaining scalable data infrastructure and pipelines to support ML model training and inference for fraud prevention in e-commerce. The role involves processing large volumes of data, optimizing ETL processes, ensuring data integrity, and collaborating with scientists to productionize ML models, with a focus on low latency and high reliability for inference.

What you'd actually do

  1. Design, build, and maintain scalable data pipelines that support multiple ML model training and inference workflows
  2. Develop and optimize ETL processes to ingest, transform, and prepare terabytes of data from diverse sources for model consumption
  3. Implement robust data quality checks and monitoring systems to ensure data integrity across all pipelines
  4. Build and maintain infrastructure for model training pipelines, including feature engineering, data versioning, and experiment tracking
  5. Design and implement scalable inference pipelines that serve predictions for millions of transactions with low latency and high reliability
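To make item 3 above concrete, here is a minimal sketch of the kind of data quality gate such a pipeline might run before a batch reaches model training. All names (`check_batch`, the field names, the 5% null threshold) are illustrative assumptions, not taken from the posting.

```python
# Hypothetical data-quality gate for a pipeline batch: reject the
# batch if any required field is null too often. Names and the
# threshold are illustrative, not from the job posting.

def check_batch(records, required_fields, max_null_rate=0.05):
    """Return (ok, report) where report maps each required field
    to its null rate across the batch."""
    counts = {f: 0 for f in required_fields}
    for rec in records:
        for f in required_fields:
            if rec.get(f) is None:
                counts[f] += 1
    total = len(records) or 1  # avoid division by zero on empty batches
    report = {f: n / total for f, n in counts.items()}
    ok = all(rate <= max_null_rate for rate in report.values())
    return ok, report
```

In practice a check like this would fail fast or quarantine the batch when `ok` is False, so bad data never silently reaches training or inference.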

Skills

Required

  • 3+ years of data engineering experience
  • 1+ years of experience developing and operating large-scale data structures for business intelligence analytics using ETL/ELT processes, OLAP technologies, data modeling, SQL, and Oracle
  • Experience with data modeling, warehousing, and building ETL pipelines

Nice to have

  • Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Kinesis Data Firehose, Lambda, and IAM roles and permissions
  • Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)

What the JD emphasized

  • scalable data infrastructure and pipelines
  • terabytes of data
  • ML model training and inference workflows
  • low latency and high reliability

Other signals

  • building scalable data infrastructure and pipelines
  • process terabytes of data
  • enabling state-of-the-art algorithms
  • own end-to-end data systems
  • directly impact the team's ability to deliver insights and models
  • manage the safety of millions of transactions
  • scaling up our operations with automation
  • empower scientists to develop advanced machine learning systems
  • productionize ML models
  • translating research code into production-ready systems
  • low latency and high reliability