Software Engineer III - Data Engineer, Databricks

JPMorgan Chase · Banking · Bengaluru, Karnataka, India · Asset & Wealth Management

As a Software Engineer III - Data Engineer at JPMorgan Chase, you will design, build, and maintain batch and streaming data pipelines using Databricks, PySpark, and Spark SQL. Responsibilities include building ETL/ELT workflows, modeling data, tuning performance, ensuring data quality, and supporting production operations in a fintech domain.

What you'd actually do

  1. Design, build, and maintain batch and (as needed) streaming data pipelines using Databricks.
  2. Develop and optimize ETL/ELT workflows using PySpark / Spark SQL and Databricks workflows/jobs.
  3. Implement data modeling (bronze/silver/gold patterns), curation, and dataset publishing for analytics and consumption.
  4. Tune Spark jobs for performance, cost, and scalability (partitioning, file sizing, caching, joins, etc.).
  5. Ensure strong data quality through validations, reconciliations, monitoring, and alerting.
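
For a sense of what the data quality work in item 5 can involve, here is a minimal sketch of a reconciliation check in plain Python. It is illustrative only: the `reconcile` helper, the `ReconciliationResult` type, the `trade_id` key, and the sample rows are all hypothetical and not taken from this role's actual codebase. On Databricks, the same idea would more likely be expressed with PySpark or Spark SQL against real tables.

```python
from dataclasses import dataclass


@dataclass
class ReconciliationResult:
    source_count: int
    target_count: int
    null_key_count: int

    @property
    def passed(self) -> bool:
        # Pass only when row counts match and no target row is missing its key.
        return self.source_count == self.target_count and self.null_key_count == 0


def reconcile(source_rows, target_rows, key="trade_id"):
    """Compare row counts and count target rows whose key is missing.

    `key` defaults to a hypothetical column name chosen for illustration.
    """
    null_keys = sum(1 for row in target_rows if row.get(key) is None)
    return ReconciliationResult(len(source_rows), len(target_rows), null_keys)


# Hypothetical sample data standing in for a source extract and a curated table.
source = [{"trade_id": 1}, {"trade_id": 2}, {"trade_id": 3}]
target = [{"trade_id": 1}, {"trade_id": 2}, {"trade_id": None}]

result = reconcile(source, target)
if not result.passed:
    # A real pipeline would route this to monitoring/alerting instead of printing.
    print(f"Reconciliation failed: {result}")
```

Here the row counts match but one target row has lost its key, so the check fails. Pairing a count reconciliation with a validation rule like this catches problems that either check alone would miss.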

Skills

Required

  • Data Engineering
  • Databricks
  • Python
  • SQL
  • PySpark/Spark SQL
  • Data modeling
  • ETL/ELT
  • Performance tuning
  • Data quality
  • Monitoring
  • Troubleshooting
  • Data pipeline architecture
  • Orchestration concepts
  • Dependency management
  • Data lakes/lakehouse
  • Git-based workflows

Nice to have

  • AI/ML exposure
  • MLflow
  • Databricks model registry
  • Delta Lake
  • Streaming
  • Event-driven patterns
  • Cloud platforms
  • Data governance
  • Orchestration tools
  • Production-grade data platforms