Lead Software Engineer - Databricks/PySpark/AI

JPMorgan Chase · Banking · Wilmington, DE +1 · Corporate Sector

Lead Software Engineer responsible for building and optimizing data pipelines, retrieval systems, and tool integrations to power agentic AI systems. Focuses on data infrastructure, production code, and mentoring junior engineers.

What you'd actually do

  1. Building and optimizing data pipelines and workflows that serve as the backbone for agentic AI systems, ensuring agents have reliable, real-time access to high-quality structured and unstructured data
  2. Developing data retrieval and indexing layers that enable AI agents to autonomously search, query, and synthesize information across multiple data sources
  3. Building and maintaining tool-use infrastructure — APIs, data services, and function endpoints — that AI agents invoke to execute tasks, retrieve data, and interact with enterprise systems
  4. Implementing and enforcing best practices for data management, ensuring data quality, security, and compliance, including governance of data consumed and generated by autonomous AI agents
  5. Developing secure, high-quality production code hands-on, following AWS best practices, and deploying it efficiently through CI/CD pipelines

Skills

Required

  • Python/PySpark
  • Databricks
  • AWS
  • Spark
  • SQL
  • Lakehouse/Delta Lake architecture
  • CI/CD
  • automated testing frameworks
  • agile methodologies
  • DevOps practices
  • API development
  • data services development
  • retrieval systems development

Nice to have

  • agentic AI frameworks (e.g., LangGraph, AutoGen, CrewAI, OpenAI Assistants API)
  • tool-use and function-calling patterns for LLM-based agents
  • vector databases (e.g., Pinecone, FAISS, Chroma)
  • embedding workflows
  • RAG
  • agent memory and state management patterns
  • guardrails and safety frameworks for autonomous AI systems
  • observability and monitoring for agentic systems
  • responsible AI principles

What the JD emphasized

  • production-quality code
  • production-grade code
  • production-ready agentic AI solutions
  • production code

Other signals

  • building agentic AI systems
  • data infrastructure for AI agents
  • tool integrations for AI agents
  • retrieval systems for AI agents