Finance Decision Optimization - Data Scientist Lead

JPMorgan Chase · Banking · Columbus, OH +1 · Consumer & Community Banking

Lead Data Scientist focused on building and deploying agentic AI systems within a finance decision optimization group. This role involves architecting foundational agentic workflows, including tool/function calling, multi-step reasoning, and orchestration, while establishing technical standards for production deployment. Responsibilities include designing retrieval layers (RAG), defining agent success metrics, building evaluation harnesses, and implementing guardrails and human-in-the-loop checkpoints for safety and auditability. The role also requires building scalable data pipelines and predictive models using big data technologies, leading backtesting and validation, and staying current with AI trends.

What you'd actually do

  1. Architect and build foundational agentic workflows from the ground up — including tool/function calling, multi-step reasoning chains, and agent orchestration patterns — while establishing early technical standards that will scale from PoC to production-ready systems.
  2. Define success metrics specific to agent performance (task completion, tool-use accuracy, reasoning consistency, failure modes); build evaluation harnesses early in the PoC stage to validate agent behavior, surface edge cases, and establish quality baselines before scaling.
  3. Design and prototype the retrieval layers (RAG, tool-augmented memory, knowledge base integrations) that agents rely on to take actions, ensuring data quality and access controls are considered from day one of the PoC to avoid rearchitecting later. Identify and mitigate risks unique to autonomous agents (unintended actions, prompt injection, cascading tool-call failures, data leakage), and establish guardrails and human-in-the-loop checkpoints early in the PoC to build a safe, auditable agent framework.
  4. Build, compile, and automate scalable data pipelines, complex predictive models, and optimization routines using big data technologies (Spark, Databricks, Snowflake) on cloud platforms; transform massive volumes of data into actionable business insights and package solutions into repeatable, executable workflows for QA testing and production deployment.
  5. Lead solution backtesting exercises across key stakeholder domains (e.g., Fair Lending), validate model performance against historical data, identify analytical gaps, and proactively surface critical issues to business and technology partners to ensure models are robust, reliable, and decision-ready.

Skills

Required

  • 5 years of relevant professional experience as a software engineer, data/ML engineer, data scientist, or AI/ML systems engineer
  • Bachelor's degree in Computer Science, Financial Engineering, MIS, Mathematics, Statistics, or another quantitative field
  • Practical knowledge of the banking sector, specifically in areas of retail deposits, auto, card, and mortgage lending
  • Understanding of relevant compliance and regulatory contexts (e.g., Fair Lending)
  • Working knowledge of LLMs, agentic AI frameworks, and emerging AI engineering practices, including tool/function calling, RAG architectures, prompt design, and agent orchestration patterns
  • Exceptional analytical and problem-solving abilities
  • Capable of translating complex technical concepts to a wide range of audiences
  • Highly detail-oriented
  • Proven track record of delivering tasks on schedule
  • Able to manage multiple priorities efficiently
  • Excellent team player
  • Strong interpersonal skills
  • Able to work cross-functionally using a consultative approach
  • Ability to mentor junior staff
  • Willingness to contribute to a culture of shared technical ownership and continuous improvement
  • Ability to instrument agent workflows with observability (action traces, decision logs, cost and latency tracking) from the earliest prototype
  • Ability to synthesize PoC findings into architectural decisions, runbooks, and optimization strategies (caching, model routing, token budgets) that accelerate the path to production deployment

Nice to have

  • Proficiency in Python programming with a strong grasp of object-oriented and functional programming concepts
  • Experience applying Python in data processing, ML model development, and AI/LLM application development including prompt engineering and agentic workflow orchestration
  • Hands-on experience with LLM orchestration frameworks (e.g., LangChain, LangGraph, LlamaIndex, or similar)
  • Familiarity with embedding models, vector databases (e.g., FAISS, Pinecone, pgvector), retrieval-augmented generation (RAG) pipelines, and evaluation frameworks for agentic systems
  • Extensive knowledge of Apache Spark with experience optimizing Spark jobs for performance and scalability within Databricks
  • Hands-on experience with cloud platforms (AWS EC2, EMR, S3/EFS or equivalent)
  • Proficiency with Snowflake for large-scale data processing and analytics
  • Advanced SQL skills

What the JD emphasized

  • Demonstrated track record of delivering complex, end-to-end technical solutions in production or near-production environments
  • Understanding of relevant compliance and regulatory contexts (e.g., Fair Lending)
  • Working knowledge of LLMs, agentic AI frameworks, and emerging AI engineering practices, including tool/function calling, RAG architectures, prompt design, and agent orchestration patterns
  • Instrument agent workflows with observability (action traces, decision logs, cost and latency tracking) from the earliest prototype and synthesize PoC findings into architectural decisions, runbooks, and optimization strategies (caching, model routing, token budgets) that accelerate the path to production deployment.

Other signals

  • building agentic workflows
  • tool/function calling
  • multi-step reasoning
  • agent orchestration
  • retrieval layers (RAG)
  • evaluating agent performance
  • guardrails for agents