Software Development Engineer II, AWS Data Platform

Amazon · Big Tech · Seattle, WA · Software Development

Software Development Engineer II on the AWS Data Platform team, responsible for building and maintaining data infrastructure, processing workflows, and tooling at massive scale. The role involves ingesting, transforming, cataloging, governing, and consuming data, with an increasing focus on leveraging generative AI and semantic layer technologies for data discoverability and natural-language access.

What you'd actually do

  1. Build and maintain data infrastructure using software engineering best practices, data management fundamentals, data storage principles, and operational excellence standards — creating datasets that analysts, scientists, and AI systems use to generate actionable insights.
  2. Develop automation and tooling that improves the reliability, scalability, and efficiency of data processing workflows across EMR, Spark, Redshift, and ingestion services.
  3. Design and implement data storage and compute solutions that balance cost, performance, and availability using distributed systems principles and open table formats (Hudi/Iceberg) to handle the ever-growing volume of AWS data.
  4. Own your services end to end: participate in on-call rotations, debug production issues, and continually reduce operational burden through directed engineering investments — fewer SEV2s, fewer manual interventions, more automation.
  5. Collaborate with business owners and internal stakeholders to understand data requirements and translate them into scalable, low-cost data flows from production systems into the data platform.

Skills

Required

  • Software engineering best practices
  • Data management fundamentals
  • Data storage principles
  • Distributed systems principles
  • Operational excellence standards
  • Data ingestion
  • Data transformation
  • Data cataloging
  • Data governance
  • Data consumption
  • EMR
  • Spark
  • Redshift
  • Apache Iceberg
  • Hudi

Nice to have

  • Generative AI
  • Semantic layer technologies
  • Natural-language access to datasets
  • AI-powered recommendations

Other signals

  • Leveraging generative AI and semantic layer technologies
  • Enabling natural-language access to datasets
  • AI-powered recommendations
  • AI agents