We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.
As a Lead Software Engineer at JPMorganChase within the Marketing Automation Platforms Team, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology products in a secure, stable, and scalable way. As a core technical contributor, you are responsible for delivering critical technology solutions across multiple technical areas and business functions in support of the firm’s business objectives.
Job responsibilities
- Lead, mentor, and grow a high-performing team of 5–7 engineers across multiple workstreams, fostering a culture of innovation, ownership, and technical excellence.
- Set the technical vision and engineering roadmap for the Data Products platform, aligning with firmwide priorities.
- Operate as a player-coach — providing hands-on architectural guidance while empowering the team to own and deliver independently.
- Drive cross-functional collaboration with platform teams, domain Data Product Owners, AI/ML teams, and governance teams.
- Architect and own the end-to-end technical design of the Data Products Studio — a scalable, enterprise-grade platform that orchestrates the discovery, design, build, and productionization of data products from the CCB Data Lake and Snowflake.
- Design the platform's AI/Agentic AI layer, leveraging intent agents, NLP Text-to-SQL, Knowledge Graphs, Retrieval-Augmented Generation (RAG), Vector Databases, and Agent-to-Agent (A2A) communication to enable intelligent, automated data product creation and natural language interaction with the data estate.
- Establish and enforce architectural standards, design patterns, and engineering best practices across the team — ensuring scalability, security, resilience, and maintainability.
- Lead the design and development of Agentic AI capabilities that power the Data Products Framework, including:
  - autonomous discovery agents that profile and recommend data product candidates;
  - design agents that auto-generate data contracts and schema recommendations;
  - build agents that generate and optimize data pipelines;
  - governance agents that auto-apply entitlements based on data classification;
  - quality agents that detect anomalies and drift and trigger self-healing remediation.
- Architect the Agent-to-Agent communication layer enabling multi-agent orchestration across the data product lifecycle — from discovery through productionization.
- Leverage RAG and Vector Databases to enable contextual, knowledge-grounded AI interactions with metadata, lineage, and data catalog information.
- Implement NLP Text-to-SQL capabilities allowing business users to explore the CCB Data Lake and Snowflake using natural language, lowering the barrier to data product discovery.
Required qualifications, capabilities, and skills
- Proven track record of architecting and delivering large-scale, enterprise-grade data platforms or frameworks from concept through production in a large corporate environment.
- Deep hands-on expertise in Python, SQL, and at least one additional language or framework stack (e.g., Java 17+ with Spring Boot), with strong system design and distributed systems knowledge.
- Extensive experience designing, building, and optimizing ETL/ELT pipelines at scale, including batch and real-time data processing.
- Strong proficiency in PySpark for distributed data processing, including the DataFrame API and Spark SQL.
- Experience working with modern UI frameworks such as React or Angular.
- Extensive experience with AWS cloud services including S3, Athena, Glue, Lambda, Step Functions, IAM, and KMS, along with infrastructure as code using Terraform.
- Basic knowledge of Snowflake (architecture, performance optimization, Tasks, Streams, Stored Procedures, Materialized Views, and security model).
- Experience designing and building AI/ML-powered platforms or applications, with working knowledge of LLMs, RAG architectures, Vector Databases, NLP, and agentic frameworks.
- Deep understanding of data governance principles including metadata management, data lineage, access control (RBAC/ABAC), data classification, and policy enforcement.
- Experience with Grafana or equivalent observability platforms for custom dashboards, APM, SLA monitoring, and alerting.