Lead Software Engineer - Java/Python, AWS, Spark

JPMorgan Chase · Banking · Pune, Maharashtra, India · Consumer & Community Banking

Lead Software Engineer with expertise in designing, developing, and maintaining scalable cloud-based data processing pipelines and infrastructure. The role focuses on architecting data models, partnering with cross-functional teams, and driving data quality and governance within a large enterprise, with a strong emphasis on Java/Python, AWS, Spark, and related data engineering technologies.

What you'd actually do

  1. Lead the design, development, and maintenance of robust, scalable cloud-based data processing pipelines and infrastructure, ensuring adherence to engineering standards, governance frameworks, and industry best practices.
  2. Architect and refine data models for large-scale datasets, optimizing for efficient storage, high-performance retrieval, and advanced analytics while upholding data integrity and quality.
  3. Partner with cross-functional teams to translate complex business requirements into effective, scalable data engineering solutions that drive organizational value.
  4. Champion a culture of innovation and continuous improvement, proactively identifying and implementing enhancements to data infrastructure, processing workflows, and analytics capabilities.
  5. Define and execute data strategy, including the development of enterprise data models and the management of end-to-end data infrastructure, from design and construction through installation and ongoing maintenance of large-scale processing systems.

Skills

Required

  • software engineering concepts
  • Spark
  • AWS data lake services or Databricks
  • Airflow
  • relational and NoSQL databases
  • JSON
  • Avro
  • Protobuf
  • Parquet
  • Iceberg
  • Java
  • Python
  • SQL
  • Scala
  • microservices architecture
  • serverless computing
  • Docker
  • Kubernetes
  • Dimensional data modeling
  • Data Vault data modeling
  • Kimball data modeling
  • Inmon data modeling
  • test-driven development (TDD)
  • behavior-driven development (BDD)
  • continuous integration and continuous deployment (CI/CD)
  • design patterns
  • Kafka
  • MQ

Nice to have

  • Terraform
  • AWS CloudFormation
  • Snowflake
  • budgeting and resource allocation
  • vendor relationship management

What the JD emphasized

  • adherence to engineering standards
  • governance frameworks
  • regulatory requirements