What you'd actually do

Design and maintain scalable data pipelines using native AWS services (Glue, EMR, Lambda); build monitoring and error handling for data workflows; optimize performance, reliability, and cost efficiency

Develop and maintain APIs and data serving layers that productionize science models for downstream consumption; build batch and real-time inference pipelines

Build scalable feature extraction and processing frameworks for diverse data types; develop robust data quality and validation checks; create flexible schemas supporting evolving requirements

Partner with economics, data science, and software engineering teams to translate analytical requirements into production-ready solutions; participate in technical design reviews and architecture discussions

Maintain layered data systems used by economists and scientists; build automated reporting solutions; work across multiple interconnected AWS accounts with security best practices

Skills

Required

professional software engineering best practices
full software development life cycle
coding standards
software architectures
code reviews
source control management
continuous deployments
testing
operational excellence
3+ years of data engineering experience
Python
Java
Scala
NodeJS
data modeling
data warehousing
ETL pipelines
AWS Glue
AWS EMR
AWS Lambda
Redshift
S3
Kinesis
FireHose
non-relational databases
object storage
document or key-value stores
graph databases
column-family databases
Bachelor's degree or foreign equivalent in computer science, engineering, mathematics or equivalent

Nice to have

Hadoop
Hive
Spark
EMR
4+ years of full software development life cycle experience
Bachelor's degree or above in computer science

The PXT Central Science team is looking for a Data Engineer. This individual will join a team of economists and scientists to own and accelerate science and analytics in our rapid employee intelligence workstream. This suite of models identifies causal factors driving changes in employee sentiment, actions, and business outcomes.

Key job responsibilities

PXTCS is looking for a data engineer with expertise in complex data environments. You will be responsible for enhancing our existing data architecture to further standardize metrics and definitions, building and testing new features, developing end-to-end data engineering solutions for complex analytical problems, and collaborating with economists, data scientists, and software engineers to translate data into actionable insights. Specific responsibilities include:

• Data Pipeline Development: Design and maintain scalable data pipelines using native AWS services (Glue, EMR, Lambda); build monitoring and error handling for data workflows; optimize performance, reliability, and cost efficiency
• Model Productionization & API Development: Develop and maintain APIs and data serving layers that productionize science models for downstream consumption; build batch and real-time inference pipelines
• Data Integration & Quality: Build scalable feature extraction and processing frameworks for diverse data types; develop robust data quality and validation checks; create flexible schemas supporting evolving requirements
• Cross-team Collaboration: Partner with economics, data science, and software engineering teams to translate analytical requirements into production-ready solutions; participate in technical design reviews and architecture discussions
• Analytics & Infrastructure: Maintain layered data systems used by economists and scientists; build automated reporting solutions; work across multiple interconnected AWS accounts with security best practices

About the team

The Central Science Team within Amazon’s People Experience and Technology org (PXTCS) uses economics, behavioral science, statistics, machine learning, and Generative AI to proactively identify mechanisms and process improvements which simultaneously improve Amazon and the lives, well-being, and the value of work to Amazonians. We are an interdisciplinary team, which combines the talents of science, engineering, and UX to develop and deliver solutions that measurably achieve this goal.

Basic Qualifications

Knowledge of professional software engineering & best practices for full software development life cycle, including coding standards, software architectures, code reviews, source control management, continuous deployments, testing, and operational excellence
3+ years of data engineering experience
Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
Experience with data modeling, warehousing and building ETL pipelines
Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, FireHose, Lambda, and IAM roles and permissions
Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
Bachelor's degree or foreign equivalent in computer science, engineering, mathematics or equivalent

Preferred Qualifications

Experience with big data technologies such as: Hadoop, Hive, Spark, EMR
4+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
Bachelor's degree or above in computer science

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company’s reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.

USA, CA, San Francisco - 152,000.00 - 205,600.00 USD annually
USA, VA, Arlington - 132,100.00 - 178,800.00 USD annually
USA, WA, Bellevue - 132,100.00 - 178,800.00 USD annually
USA, WA, Seattle - 132,100.00 - 178,800.00 USD annually

Data Engineer, PXT Central Science
