Software Engineer II, Python, Spark

JPMorgan Chase · Banking · Mumbai, Maharashtra, India · Consumer & Community Banking

Software Engineer II role focused on designing, developing, testing, and maintaining data pipelines and architectures using Python and Spark. The role involves working with Big Data stacks, cloud implementation (AWS), and utilizing enterprise-authorized AI coding assist tools to improve productivity and code quality. Responsibilities include data lifecycle management, control reviews, and performance tuning.

What you'd actually do

  1. Support review of controls to ensure sufficient protection of enterprise data.
  2. Advise on and make custom configuration changes in one to two tools to generate a product at the business's or customer's request; update logical or physical data models based on new use cases.
  3. Produce architecture and design artifacts for complex applications while being accountable for ensuring design constraints are met by software code development; gather, analyze, synthesize, and develop visualizations and reporting from large, diverse data sets in service of continuous improvement of software applications and systems.
  4. Proactively identify hidden problems and patterns in data and use these insights to drive improvements to coding hygiene and system architecture.
  5. Leverage enterprise-authorized AI coding assist tools within the work environment to improve code quality, delivery speed, and productivity across complex deliverables (e.g., code generation/refactoring, unit test creation, documentation), while validating outputs through peer review, automated testing, and secure coding standards; contribute learnings and reusable patterns to improve broader team effectiveness.

Skills

Required

  • Formal training or certification on software engineering concepts
  • Experience across the data lifecycle
  • Spark-based Frameworks for end-to-end ETL, ELT & reporting solutions using key components like Spark SQL & Spark Streaming
  • Big Data stack including Spark and Python (Pandas, Spark SQL)
  • RDBMS (relational) and NoSQL databases, and Linux/UNIX
  • Multi-threading and high-volume batch processing
  • Performance tuning for Python and Spark, along with the Autosys or Control-M scheduler
  • Cloud implementation experience with AWS, including: AWS Data Services: proficiency in Lake Formation, Glue ETL (or) EMR, S3, Glue Catalog, Athena, Kinesis (or) MSK, Airflow (or) Lambda + Step Functions + EventBridge; Data De/Serialization: expertise in at least 2 of the formats Parquet, AVRO, Fixed Width; AWS Data Security: good understanding of security concepts such as Lake Formation, IAM, service roles, encryption, KMS, Secrets Manager
  • Hands-on experience using enterprise-authorized AI-assisted software development tools within the work environment (e.g., for coding, test creation, troubleshooting, or documentation) with demonstrated ability to critically evaluate, validate, and refine AI-generated outputs for correctness, performance, and security
  • Understanding of responsible AI use in engineering workflows, including data sensitivity considerations, secure handling of inputs/outputs, and adherence to resiliency and security expectations; ability to guide peers on safe and effective usage within team practices

Nice to have

  • Proficient in all aspects of the Software Development Life Cycle
  • Solid understanding of agile methodologies such as CI/CD, Application Resiliency, and Security
  • Knowledge of Java and Microservice architecture

What the JD emphasized

  • 3+ years applied experience
  • Spark-based Frameworks for end-to-end ETL, ELT & reporting solutions using key components like Spark SQL & Spark Streaming
  • Strong hands on working experience of Big Data stack including Spark and Python (Pandas, Spark SQL)
  • Cloud implementation experience with AWS, including: AWS Data Services: proficiency in Lake Formation, Glue ETL (or) EMR, S3, Glue Catalog, Athena, Kinesis (or) MSK, Airflow (or) Lambda + Step Functions + EventBridge; Data De/Serialization: expertise in at least 2 of the formats Parquet, AVRO, Fixed Width; AWS Data Security: good understanding of security concepts such as Lake Formation, IAM, service roles, encryption, KMS, Secrets Manager
  • Hands-on experience using enterprise-authorized AI-assisted software development tools within the work environment (e.g., for coding, test creation, troubleshooting, or documentation) with demonstrated ability to critically evaluate, validate, and refine AI-generated outputs for correctness, performance, and security
  • Understanding of responsible AI use in engineering workflows, including data sensitivity considerations, secure handling of inputs/outputs, and adherence to resiliency and security expectations; ability to guide peers on safe and effective usage within team practices