Big Data Software Engineer - Python at JPMorgan Chase

What you'd actually do

Acquire and manage data from primary and secondary data sources

Identify, analyze, and interpret trends or patterns in complex data sets

Transform existing ETL logic on AWS and Databricks

Innovate new ways of managing, transforming and validating data

Implement new services and scripts, or enhance existing ones (in both object-oriented and functional programming)

Skills

Required

advanced Python programming
Pandas
NumPy
Spark
Kafka
Databricks
ETL transformations
AWS services (EC2, EMR, ASG, Lambda, EKS, RDS)
API development
SQL queries
linear algebra
statistics
algorithms
UNIX shell scripting
data quality testing
relational database environment (Oracle, SQL Server)
analytical skills
attention to detail
accuracy
development discipline
best practices and standards

Nice to have

Data Science
Machine Learning
AI
Financial Services
Commercial banking
NoSQL platforms (MongoDB, AWS Open Search)

What the JD emphasized

extensive experience in utilizing libraries such as Pandas and NumPy

Experience in code and infrastructure for Big Data technologies (e.g. Spark, Kafka, Databricks, etc.) and implementing complex ETL transformations

Experience with AWS services including EC2, EMR, ASG, Lambda, EKS, RDS and others

Strong understanding of linear algebra, statistics, and algorithms

Strong experience with UNIX shell scripting to automate file preparation and database loads

Experience in data quality testing; adept at writing test cases and scripts, presenting and resolving data issues

As a Software Engineer III at JPMorgan Chase within the Commercial & Investment Bank, you serve as a seasoned member of an agile team to design and deliver trusted market-leading technology products in a secure, stable, and scalable way. You are responsible for carrying out critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.

The JPMorgan Chase Commercial & Investment Bank is undertaking a strategic initiative called Client 360, aimed at developing a big data platform and firmwide solution for Entity Resolution and Relationships. We are seeking a Big Data Software Engineer with skills and experience implementing large-scale cloud platform processing of internal and third-party data. This individual will do groundbreaking work to implement new solutions for Client 360 - Entity Resolution and Relationships and to enhance the existing platform.

Job responsibilities

Acquire and manage data from primary and secondary data sources
Identify, analyze, and interpret trends or patterns in complex data sets
Transform existing ETL logic on AWS and Databricks
Innovate new ways of managing, transforming and validating data
Implement new services and scripts, or enhance existing ones (in both object-oriented and functional programming)
Establish and enforce guidelines to ensure consistency, quality and completeness of data assets
Apply quality assurance best practices to all work products
Analyze, design and implement business-related solutions and core architectural changes using Agile programming methodologies with a development team
Become comfortable learning cutting-edge technology stacks and applying them to greenfield projects

Qualifications

Proficiency in advanced Python programming, with extensive experience in utilizing libraries such as Pandas and NumPy.
Experience in code and infrastructure for Big Data technologies (e.g. Spark, Kafka, Databricks, etc.) and implementing complex ETL transformations
Experience with AWS services including EC2, EMR, ASG, Lambda, EKS, RDS and others
Experience developing APIs leveraging different back-end data stores (RDS, Graph, Dynamo, etc.)
Experience in writing efficient SQL queries
Strong understanding of linear algebra, statistics, and algorithms.
Strong experience with UNIX shell scripting to automate file preparation and database loads
Experience in data quality testing; adept at writing test cases and scripts, presenting and resolving data issues
Familiarity with relational database environment (Oracle, SQL Server, etc.) leveraging databases, tables/views, stored procedures, agent jobs, etc.
Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy
Strong development discipline and adherence to best practices and standards.

Preferred qualifications, capabilities and skills

Experience in Data Science, Machine Learning and AI is a plus
Financial Services and Commercial banking experience is a plus
Familiarity with NoSQL platforms (MongoDB, AWS Open Search) is a plus
