What you'd actually do

Design, implement and maintain data infrastructure including data modeling, ETL pipelines, and ongoing maintenance.

Partner with product, operational, and technical teams to build data pipelines from a wide variety of sources using AWS big data technologies (Lake Formation, Glue, S3, MWAA, Lambda, etc.).

Build a data dictionary, catalog, and governance plan, and manage and audit all registered data via robust mechanisms.

Work with data consumers to provide and correlate the right data for business intelligence and machine learning use-cases.

Develop automated solutions to minimize manual processes, with a focus on efficiency and scalability.

Skills

Required

3+ years of data engineering experience
Experience with data modeling, warehousing and building ETL pipelines
Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS

Nice to have

Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
Experience in automating, deploying, and supporting infrastructure
Experience building or managing data infrastructure using Infrastructure as Code (CDK or Terraform)
Experience with data governance and data catalog frameworks

What the JD emphasized

enterprise grade data platform

data lake and mesh architectures

AWS data services

data modeling

ETL pipelines

data pipelines

data dictionary

data catalog

governance plan

data stewardship

data and analytics infrastructure

data governance

tagging and access controls

data discoverability

data quality monitoring

automated tagging

GenAI-powered tools

custom data integrations

Amazon Leo is Amazon’s low Earth orbit satellite broadband network. Its mission is to deliver fast, reliable internet to customers and communities around the world, and we’ve designed the system with the capacity, flexibility, and performance to serve a wide range of customers, from individual households to schools, hospitals, businesses, government agencies, and other organizations operating in locations without reliable connectivity.

Export Control: This position requires that the candidate selected be a U.S. Citizen in order to comply with U.S. government-imposed requirements related to the nature of the work and/or where it will be performed.

The Leo Intelligence Data Services (KIDS) team is seeking a Data Engineer II who will help architect and build an enterprise-grade data platform, combining data lake and mesh architectures using the latest AWS data services. In this role, you'll be responsible for building and maintaining the organization's central data platform that serves as the single source of truth, enabling teams across the company to develop analytics and dashboards. Working closely with cross-functional teams across hardware, software, supply chain, manufacturing, launch, facilities, finance, compliance, and HR, you'll implement sophisticated data architectures and self-service tooling to support various analytics use cases, including business reporting, production pipelines, optimization models, statistical analysis, and simulations.

As a Data Engineer II, you will be responsible for the architecture and engineering of data infrastructure across our supply chain, development, production, and test systems. You'll drive production optimization by managing data stores, developing key performance indicators, and enabling data-driven program decisions. The role focuses on architecting and implementing data ingestion and data vending processes that provide crucial insights into our business health, ultimately serving as the foundation for our organization's data-driven future. Strong expertise in AWS data services and experience with enterprise-scale data architectures are essential for success in this position.

The ideal candidate is a technically savvy engineer with strong ownership and a detail-oriented, analytical mindset who communicates effectively with both technical and business teams and thrives in a fast-paced agile environment. They combine strong technical expertise with business acumen, can manage multiple priorities, and adapt quickly to changing requirements and priorities.

Key job responsibilities

Design, implement and maintain data infrastructure including data modeling, ETL pipelines, and ongoing maintenance.
Partner with product, operational, and technical teams to build data pipelines from a wide variety of sources using AWS big data technologies (Lake Formation, Glue, S3, MWAA, Lambda, etc.).
Build a data dictionary, catalog, and governance plan, and manage and audit all registered data via robust mechanisms.
Work with data consumers to provide and correlate the right data for business intelligence and machine learning use-cases.
Develop automated solutions to minimize manual processes, with a focus on efficiency and scalability.
Provide guidance on data stewardship for teams onboarding to the central data systems.

A day in the life

As a Data Engineer II, you'll start your day collaborating with cross-functional partners to understand their data needs. You might spend time optimizing ETL pipelines, reviewing code with team members, designing new data models, building new self-service frameworks for downstream consumers, or enhancing our data infrastructure for security, scalability, and reliability. You will work with partner teams to build golden datasets. You'll participate in technical discussions to solve complex problems.

About the team

The Amazon Leo Intelligence Data Services (KIDS) team owns DataHive, a centralized data and analytics infrastructure that serves as the single source of truth, enabling teams to develop insights and analytics tools that drive Amazon Leo's satellite production operations. The platform enforces stringent data governance with tagging and access controls, enhancing data discoverability and security. We're building the next generation of self-service capabilities for both data providers and consumers, including data catalogs, quality monitoring, automated tagging, GenAI-powered tools, and custom data integrations that make data accessible and actionable for all Amazon Leo teams.

Basic Qualifications

3+ years of data engineering experience
Experience with data modeling, warehousing and building ETL pipelines
Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS

Preferred Qualifications

Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
Experience in automating, deploying, and supporting infrastructure
Experience building or managing data infrastructure using Infrastructure as Code (CDK or Terraform)
Experience with data governance and data catalog frameworks

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.

USA, WA, Bellevue - 132,100.00 - 178,800.00 USD annually
