Software Engineer II, Python, Spark

JPMorgan Chase · Banking · Mumbai, Maharashtra, India · Consumer & Community Banking

Software Engineer II role focused on designing, developing, testing, and maintaining data pipelines and architectures using Python and Spark. The role involves working with Big Data stacks, cloud implementation (AWS), and utilizing enterprise-authorized AI coding assist tools to improve productivity and code quality. Responsibilities include data lifecycle management, control reviews, and performance tuning.

What you'd actually do

Supports review of controls to ensure sufficient protection of enterprise data
Advises on and makes custom configuration changes in one to two tools to generate a product at the business's or customer's request; also updates logical or physical data models based on new use cases
Produces architecture and design artifacts for complex applications while being accountable for ensuring design constraints are met by software code development; also gathers, analyzes, synthesizes, and develops visualizations and reporting from large, diverse data sets in service of continuous improvement of software applications and systems.
Proactively identifies hidden problems and patterns in data and uses these insights to drive improvements to coding hygiene and system architecture.
Leverages enterprise-authorized AI coding assist tools within the work environment to improve code quality, delivery speed, and productivity across complex deliverables (e.g., code generation/refactoring, unit test creation, documentation), while validating outputs through peer review, automated testing, and secure coding standards; contributes learnings and reusable patterns to improve broader team effectiveness.

Skills

Required

Formal training or certification on software engineering concepts
Experience across the data lifecycle
Spark-based Frameworks for end-to-end ETL, ELT & reporting solutions using key components like Spark SQL & Spark Streaming
Big Data stack including Spark and Python (Pandas, Spark SQL)
RDBMS (relational) and NoSQL databases, and Linux/UNIX
Multi-threading and high-volume batch processing
Performance tuning for Python and Spark, along with the Autosys or Control-M scheduler
Cloud implementation experience with AWS, including: AWS Data Services (proficiency in Lake Formation, Glue ETL or EMR, S3, Glue Catalog, Athena, Kinesis or MSK, Airflow or Lambda + Step Functions + EventBridge); Data De/Serialization (expertise in at least 2 of the formats: Parquet, Avro, Fixed Width); AWS Data Security (good understanding of security concepts such as Lake Formation, IAM, service roles, encryption, KMS, Secrets Manager)
Hands-on experience using enterprise-authorized AI-assisted software development tools within the work environment (e.g., for coding, test creation, troubleshooting, or documentation) with demonstrated ability to critically evaluate, validate, and refine AI-generated outputs for correctness, performance, and security
Understanding of responsible AI use in engineering workflows, including data sensitivity considerations, secure handling of inputs/outputs, and adherence to resiliency and security expectations; ability to guide peers on safe and effective usage within team practices

Nice to have

Proficient in all aspects of the Software Development Life Cycle
Solid understanding of agile methodologies such as CI/CD, Application Resiliency, and Security
Knowledge of Java and Microservice architecture

What the JD emphasized

3+ years applied experience
Spark-based Frameworks for end-to-end ETL, ELT & reporting solutions using key components like Spark SQL & Spark Streaming
Strong hands-on working experience with the Big Data stack, including Spark and Python (Pandas, Spark SQL)
Cloud implementation experience with AWS, including: AWS Data Services (proficiency in Lake Formation, Glue ETL or EMR, S3, Glue Catalog, Athena, Kinesis or MSK, Airflow or Lambda + Step Functions + EventBridge); Data De/Serialization (expertise in at least 2 of the formats: Parquet, Avro, Fixed Width); AWS Data Security (good understanding of security concepts such as Lake Formation, IAM, service roles, encryption, KMS, Secrets Manager)
Hands-on experience using enterprise-authorized AI-assisted software development tools within the work environment (e.g., for coding, test creation, troubleshooting, or documentation) with demonstrated ability to critically evaluate, validate, and refine AI-generated outputs for correctness, performance, and security
Understanding of responsible AI use in engineering workflows, including data sensitivity considerations, secure handling of inputs/outputs, and adherence to resiliency and security expectations; ability to guide peers on safe and effective usage within team practices

Read full job description

We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.

As a Software Engineer at JPMorgan Chase within JPM - US Wealth Management, you serve as a seasoned member of an agile team to design and deliver trusted data collection, storage, access, and analytics solutions in a secure, stable, and scalable way. You are responsible for developing, testing, and maintaining critical data pipelines and architectures across multiple technical areas within various business functions in support of the firm’s business objectives.

Job responsibilities

Supports review of controls to ensure sufficient protection of enterprise data
Advises on and makes custom configuration changes in one to two tools to generate a product at the business's or customer's request; also updates logical or physical data models based on new use cases
Frequently uses SQL and understands NoSQL databases and their niche in the marketplace
Produces architecture and design artifacts for complex applications while being accountable for ensuring design constraints are met by software code development; also gathers, analyzes, synthesizes, and develops visualizations and reporting from large, diverse data sets in service of continuous improvement of software applications and systems.
Proactively identifies hidden problems and patterns in data and uses these insights to drive improvements to coding hygiene and system architecture.
Leverages enterprise-authorized AI coding assist tools within the work environment to improve code quality, delivery speed, and productivity across complex deliverables (e.g., code generation/refactoring, unit test creation, documentation), while validating outputs through peer review, automated testing, and secure coding standards; contributes learnings and reusable patterns to improve broader team effectiveness.
Applies knowledge of tools within the Software Development Life Cycle toolchain, including enterprise-authorized AI-assisted development and automation capabilities, to improve the value realized by automation

Required qualifications, capabilities, and skills

Formal training or certification on software engineering concepts and 3+ years applied experience
Experience across the data lifecycle.
Spark-based frameworks for end-to-end ETL, ELT & reporting solutions using key components like Spark SQL & Spark Streaming.
Strong hands-on working experience with the Big Data stack, including Spark and Python (Pandas, Spark SQL).
Good understanding of RDBMS (relational) and NoSQL databases, and Linux/UNIX.
Strong knowledge of multi-threading and high-volume batch processing.
Should be good at performance tuning for Python and Spark, along with the Autosys or Control-M scheduler.
Cloud implementation experience with AWS, including: AWS Data Services (proficiency in Lake Formation, Glue ETL or EMR, S3, Glue Catalog, Athena, Kinesis or MSK, Airflow or Lambda + Step Functions + EventBridge); Data De/Serialization (expertise in at least 2 of the formats: Parquet, Avro, Fixed Width); AWS Data Security (good understanding of security concepts such as Lake Formation, IAM, service roles, encryption, KMS, Secrets Manager).
Hands-on experience using enterprise-authorized AI-assisted software development tools within the work environment (e.g., for coding, test creation, troubleshooting, or documentation) with demonstrated ability to critically evaluate, validate, and refine AI-generated outputs for correctness, performance, and security.
Understanding of responsible AI use in engineering workflows, including data sensitivity considerations, secure handling of inputs/outputs, and adherence to resiliency and security expectations; ability to guide peers on safe and effective usage within team practices

Preferred qualifications, capabilities, and skills

Proficient in all aspects of the Software Development Life Cycle.
Solid understanding of agile methodologies such as CI/CD, Application Resiliency, and Security.
Knowledge of Java and Microservice architecture
