Data Engineer II, Intelligence Data Services, Amazon Leo

Amazon · Big Tech · Bellevue, WA · Data Science

Data Engineer II role focused on building and maintaining an enterprise-grade data platform using AWS services, combining data lake and mesh architectures. Responsibilities include data modeling, ETL pipeline development, data governance, and supporting analytics and machine learning use cases for the Amazon Leo satellite broadband network.

What you'd actually do

  1. Design, implement, and maintain data infrastructure, including data models and ETL pipelines.
  2. Partner with product, operational, and technical teams to build data pipelines from a wide variety of sources using AWS big data technologies (Lake Formation, Glue, S3, MWAA, Lambda, etc.).
  3. Build a data dictionary, catalog, and governance plan, and manage and audit all registered data through robust mechanisms.
  4. Work with data consumers to provide and correlate the right data for business intelligence and machine learning use cases.
  5. Develop automated solutions that minimize manual processes, with a focus on efficiency and scalability.

Skills

Required

  • 3+ years of data engineering experience
  • Experience with data modeling, warehousing, and building ETL pipelines
  • Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or Node.js

Nice to have

  • Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
  • Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
  • Experience in automating, deploying, and supporting infrastructure
  • Experience building or managing data infrastructure using Infrastructure as Code (CDK or Terraform)
  • Experience with data governance and data catalog frameworks

What the JD emphasized

  • enterprise-grade data platform
  • data lake and mesh architectures
  • AWS data services
  • data modeling
  • ETL pipelines
  • data pipelines
  • data dictionary
  • data catalog
  • governance plan
  • data stewardship
  • data and analytics infrastructure
  • data governance
  • tagging and access controls
  • data discoverability
  • data quality monitoring
  • automated tagging
  • GenAI-powered tools
  • custom data integrations