Data Engineer, Denied Party Screening, AWS Compliance & Security Assurance

Amazon · Big Tech · Seattle, WA · Business Intelligence

This role is for a Data Engineer on the Denied Party Screening (DPS) team, which focuses on preventing denied parties from transacting with Amazon businesses. The engineer will design, implement, and support scalable data infrastructure and pipelines using AWS big data technologies: integrating data from various sources, curating data for reporting, analysis, and machine learning models, and supporting real-time data pipelines. The role also involves collaborating with other teams, researching new technologies, and developing dashboards. Although the role mentions machine learning models, its core function is data engineering and infrastructure, not direct AI/ML model development or research.
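To make the team's domain concrete, denied-party screening at its simplest means checking transacting parties against a denial list. The sketch below is a hypothetical illustration only (the list contents and function names are invented, and real screening relies on curated sanctions data and fuzzy matching, not exact lookup):

```python
import re

def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace for matching."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", " ", name.lower())).strip()

# Hypothetical denial list; a real system would source this from official
# sanctions data through a curated, regularly refreshed data pipeline.
DENIED_PARTIES = {normalize(n) for n in ["ACME Exports, Ltd.", "Globex Trading Co"]}

def is_denied(party_name: str) -> bool:
    """Exact match after normalization; production screening also uses fuzzy matching."""
    return normalize(party_name) in DENIED_PARTIES

# is_denied("acme exports ltd")  -> True
# is_denied("Initech LLC")       -> False
```

The normalization step is what makes the lookup robust to casing and punctuation differences between source systems.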

What you'd actually do

  1. Design, implement, and support data warehouse / data lake infrastructure using the AWS big data stack: Python, Redshift, QuickSight, Glue/Lake Formation, EMR/Spark, Athena, etc.
  2. Develop and manage ETLs that source data from various financial, AWS networking, and operational systems and create a unified data model for analytics and reporting.
  3. Create and support real-time data pipelines built on AWS technologies including EMR, Glue, Redshift/Spectrum, and Athena.
  4. Collaborate with other Engineering teams, Product/Finance Managers/Analysts to implement advanced analytics algorithms that exploit our rich datasets for financial model development, statistical analysis, prediction, etc.
  5. Continually research the latest big data and visualization technologies to provide new capabilities and increase efficiency.
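Item 2 above, sourcing heterogeneous systems into a unified data model, can be sketched in miniature. All field names here are hypothetical, and a production job would run on Glue/EMR with Spark rather than plain Python; the point is only the shape of the work, mapping per-source schemas onto one common record:

```python
from datetime import datetime, timezone

# Records as they might arrive from two hypothetical source systems:
# a financial system keyed by "txn_id" and a networking log keyed by "event".
financial_rows = [{"txn_id": "F-1", "amount_usd": "125.50", "ts": "2024-03-01T12:00:00+00:00"}]
network_rows = [{"event": "N-9", "bytes": 2048, "ts": 1709294400}]

def from_financial(row):
    """Map a financial record into the unified schema."""
    return {
        "source": "financial",
        "record_id": row["txn_id"],
        "value": float(row["amount_usd"]),
        "occurred_at": datetime.fromisoformat(row["ts"]),
    }

def from_network(row):
    """Map a networking record into the unified schema."""
    return {
        "source": "network",
        "record_id": row["event"],
        "value": float(row["bytes"]),
        "occurred_at": datetime.fromtimestamp(row["ts"], tz=timezone.utc),
    }

# The "unified data model": one schema regardless of origin, ready for
# analytics, reporting, or feature extraction downstream.
unified = [from_financial(r) for r in financial_rows] + \
          [from_network(r) for r in network_rows]
```

Notice that each source gets its own adapter while the output schema stays fixed; that separation is what lets new sources be added without touching downstream reporting.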

Skills

Required

  • 5+ years of data engineering experience
  • Experience programming with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, Ruby
  • Experience working on and delivering end-to-end projects independently
  • Experience with big data technologies such as: Hadoop, Hive, Spark, EMR

Nice to have

  • Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
  • Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
  • Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
  • Experience with distributed systems as it pertains to data storage and computing
  • Experience as a data engineer or related specialty (e.g., software engineer, business intelligence engineer, data scientist) with a track record of manipulating, processing, and extracting value from large datasets
  • Experience providing technical leadership and mentoring other engineers for best practices on data engineering

What the JD emphasized

  • design, implement, and support scalable data infrastructure solutions
  • integrate multiple heterogeneous data sources
  • aggregate and retrieve data quickly and safely
  • curate data that can be used in reporting, analysis, machine learning models and ad-hoc data requests
  • exposed to cutting edge AWS big data technologies
  • excellent business and communication skills
  • gather infrastructure requirements
  • build up data pipelines and datasets
  • stay abreast of emerging technologies
  • investigating and implementing where appropriate
  • Design, implement, and support data warehouse/ data lake infrastructure
  • Develop and manage ETLs
  • source data from various financial, AWS networking and operational systems
  • create a unified data model for analytics and reporting
  • Creation and support of real-time data pipelines
  • Collaborate with other Engineering teams, Product/Finance Managers/Analysts
  • implement advanced analytics algorithms
  • exploit our rich datasets for financial model development, statistical analysis, prediction, etc.
  • Continual research of the latest big data and visualization technologies
  • provide new capabilities and increase efficiency
  • develop dashboards that are used by senior leadership
  • Empower technical and non-technical, internal customers to drive their own analytics and reporting
  • support ad-hoc reporting when needed
  • Working closely with team members to drive real-time model implementations for monitoring and alerting of risk systems
  • Manage numerous requests concurrently and strategically, prioritizing when necessary
  • Partner/collaborate across teams/roles to deliver results.
  • Mentor other engineers, influence positively team culture, and help grow the team
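The emphasized point about real-time monitoring and alerting of risk systems could be sketched as a simple rolling-threshold monitor over a metric stream. This is a toy heuristic with invented names and values; a production pipeline would consume from Kinesis/Firehose and apply real risk models rather than an in-memory list and a ratio check:

```python
from collections import deque

def alert_on_spikes(stream, window=5, threshold=3.0):
    """Yield an alert whenever a value exceeds `threshold` times the
    rolling mean of the previous `window` values (toy risk heuristic)."""
    recent = deque(maxlen=window)
    for value in stream:
        if len(recent) == window and value > threshold * (sum(recent) / window):
            yield value
        recent.append(value)

# Simulated metric stream; a real pipeline would read from a streaming source.
metrics = [10, 11, 9, 10, 10, 55, 10, 12]
alerts = list(alert_on_spikes(metrics))
# alerts == [55]: 55 exceeds 3x the rolling mean (10) of the prior window
```

Generators keep the monitor incremental, so the same logic works whether the stream is a test list or a long-running consumer loop.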