Data Engineer II, Professional Services Strategy & Operation, Analytics & Insights Center (PAIC)

Amazon · Big Tech · Seattle, WA · Software Development

Data Engineer II role at AWS Professional Services focused on building and scaling data products, including AI/ML and generative AI tools, for enterprise customers. Responsibilities include developing ETL pipelines, data models, and data integration using SQL, Python, and AWS services, with a focus on scalability, automation, and platform reliability.

What you'd actually do

  1. Develop and support ETL pipelines with robust monitoring and alarming
  2. Develop data models that are optimized and aggregated for business needs
  3. Develop and optimize data tables using best practices for partitioning, compression, parallelization, etc.
  4. Build robust and scalable data integration (ETL) pipelines using SQL, Python, and AWS services such as Glue, Lambda, and Step Functions (see the sketch after this list)
  5. Implement data structures using best practices in data modeling, ETL/ELT processes, and SQL/Redshift
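
Items 1 and 4 are easiest to picture with a small example. The sketch below shows one pipeline step, assuming a Step Functions workflow that invokes a Lambda function written in Python with boto3; the bucket names, object keys, and metric namespace are hypothetical placeholders, not details from the listing.

    # One ETL step: read a raw CSV object from S3, keep well-formed rows,
    # write the cleaned rows to a curated prefix, and publish a row-count
    # metric so a CloudWatch alarm can fire when the step processes nothing.
    # All resource names below are hypothetical.
    import csv
    import io

    import boto3

    s3 = boto3.client("s3")
    cloudwatch = boto3.client("cloudwatch")

    RAW_BUCKET = "example-raw-bucket"          # hypothetical
    CURATED_BUCKET = "example-curated-bucket"  # hypothetical


    def handler(event, context):
        key = event["raw_key"]  # e.g. passed in by a Step Functions state
        body = s3.get_object(Bucket=RAW_BUCKET, Key=key)["Body"].read().decode("utf-8")

        # Minimal "transform": drop rows missing the primary identifier.
        rows = [r for r in csv.DictReader(io.StringIO(body)) if r.get("event_id")]

        if rows:
            out = io.StringIO()
            writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
            s3.put_object(Bucket=CURATED_BUCKET, Key=f"curated/{key}", Body=out.getvalue())

        # Monitoring/alarming hook: alarm when RowsProcessed drops to zero.
        cloudwatch.put_metric_data(
            Namespace="EtlPipeline",
            MetricData=[{"MetricName": "RowsProcessed",
                         "Value": float(len(rows)), "Unit": "Count"}],
        )
        return {"rows_processed": len(rows), "output_key": f"curated/{key}"}

Step Functions would chain steps like this one and route failures to retries or a dead-letter state; the published metric covers the "silent empty run" case that retries alone do not catch.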

Skills

Required

  • 3+ years of experience developing and operating large-scale data structures for business intelligence analytics using ETL/ELT processes
  • Bachelor's degree in Computer Science, Engineering, a related field, or equivalent experience
  • Experience with SQL and Python scripting
  • Experience building data applications on large-scale distributed systems (e.g., EMR, Spark, Elasticsearch, Hadoop, Pig, and Hive)
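
One concrete reading of the distributed-systems bullet above: a minimal PySpark sketch of the kind of aggregation such a job might run on EMR. The dataset layout and column names are invented for illustration.

    # Aggregate a columnar event dataset into a per-customer revenue summary.
    # Paths and column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("revenue-summary").getOrCreate()

    # On EMR this would typically point at S3 (s3://...); a local path runs
    # the same code for testing.
    events = spark.read.parquet("data/curated/events")

    summary = (
        events
        .filter(F.col("revenue") > 0)  # drop refunds and zero-value rows
        .groupBy("customer_id")
        .agg(
            F.sum("revenue").alias("total_revenue"),
            F.countDistinct("order_id").alias("order_count"),
        )
    )

    summary.write.mode("overwrite").parquet("data/marts/customer_revenue")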

Nice to have

  • Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
  • Experience with database, data warehouse or data lake solutions
  • Experience with version control systems and CI/CD pipeline implementation
  • Experience with non-relational databases / data stores (object storage, document stores, etc.)

What the JD emphasized

  • advanced analytical products including AI/ML and generative AI tools
  • scaling our existing infrastructure
  • building robust data pipelines
  • technical leader owning the architecture of our data platform
  • build and maintain data pipelines, optimize queries, and ensure platform reliability
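
To make the "optimize queries" and partitioning/compression emphasis concrete, here is a minimal sketch of creating a Redshift fact table with explicit distribution and sort keys through the Redshift Data API. The workgroup, database, and table design are hypothetical and not taken from the listing.

    # Create a Redshift fact table with a distribution key (co-locates joins
    # on customer_id) and a sort key (lets the engine skip blocks when
    # filtering on event_date). Names below are hypothetical.
    import time

    import boto3

    DDL = """
    CREATE TABLE IF NOT EXISTS fact_events (
        event_id     BIGINT,
        customer_id  BIGINT,
        event_date   DATE,
        revenue      DECIMAL(12, 2)
    )
    DISTSTYLE KEY
    DISTKEY (customer_id)
    SORTKEY (event_date);
    """

    client = boto3.client("redshift-data")
    resp = client.execute_statement(
        WorkgroupName="example-workgroup",  # provisioned clusters use ClusterIdentifier plus DbUser or SecretArn instead
        Database="analytics",
        Sql=DDL,
    )

    # Poll until the statement finishes; a real pipeline would alarm on FAILED.
    while True:
        status = client.describe_statement(Id=resp["Id"])["Status"]
        if status in ("FINISHED", "FAILED", "ABORTED"):
            break
        time.sleep(1)
    print(status)

The same Data API call pattern would cover the scheduled aggregation and maintenance SQL a pipeline runs against the warehouse.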