Director - Applied AI/ML (Software Engineering / Data & Agentic Systems)

JPMorgan Chase · Banking · Bengaluru, Karnataka, India · Commercial & Investment Bank

A Director-level role focused on engineering and scaling GenAI capabilities, specifically agentic workflows and RAG systems, within an enterprise setting. The role involves building reusable frameworks and shared components, and establishing best practices for production-grade AI solutions, with a strong emphasis on evaluation, monitoring, and governance. It requires deep software and data engineering expertise, cloud-native experience, and the ability to translate complex technical concepts for senior stakeholders.

What you'd actually do

  1. Establish and promote a library of reusable GenAI/ML engineering assets, including reference implementations, standardized templates/SDKs, shared RAG components (ingestion, chunking, embedding, indexing, retrieval), and deployment patterns.
  2. Lead the creation of shared tools and platforms that streamline the end-to-end lifecycle for GenAI applications, including data pipelines, orchestration, evaluation, monitoring/telemetry, and release governance.
  3. Build and operationalize agentic GenAI workflows (planning/execution patterns, tool calling, state management, retries) with appropriate guardrails, permissions, and observability.
  4. Design and implement Generative AI evaluation and feedback loops (offline test suites, human review where needed, continuous evaluation, telemetry-based monitoring, regression gating in CI/CD).
  5. Advise on strategy and development across multiple GenAI products, applications, and technology portfolios—focusing on common capabilities that scale across teams rather than one-off solutions.
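The agentic mechanics named in item 3 (tool calling, state management, retries, guardrails) can be sketched in plain Python. This is a minimal illustration, not any specific framework's API; the tool registry, tool names, and step function are all hypothetical:

```python
import json

# Hypothetical tool registry; in a real system these would be typed,
# permissioned tools exposed to the model via a framework like LangGraph.
TOOLS = {
    "lookup_rate": lambda currency: {"USD": 1.0, "EUR": 0.92}.get(currency),
}

def run_agent_step(tool_name, args, state, max_retries=3):
    """One agent step: call a tool with retries and record it in state."""
    for attempt in range(1, max_retries + 1):
        try:
            result = TOOLS[tool_name](**args)
            # Guardrail: reject empty results instead of passing them on.
            if result is None:
                raise ValueError(f"tool {tool_name!r} returned no result")
            # State management: every call is recorded for observability.
            state["history"].append(
                {"tool": tool_name, "args": args,
                 "result": result, "attempt": attempt}
            )
            return result
        except ValueError:
            if attempt == max_retries:
                raise

state = {"history": []}
rate = run_agent_step("lookup_rate", {"currency": "EUR"}, state)
print(json.dumps(state["history"]))
```

The recorded history is what makes the workflow observable and auditable, which is the point of the "guardrails, permissions, and observability" emphasis above.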

Skills

Required

  • 10+ years of applied experience in software engineering and/or data engineering using Python, Java, or similar languages, building production distributed systems end-to-end.
  • Hands-on experience designing and delivering GenAI systems to production, including RAG (embeddings, retrieval/indexing) and evaluation/monitoring.
  • Hands-on experience building agentic workflows (tool calling, orchestration, state, retries, guardrails) using frameworks such as LangChain/LangGraph or equivalent.
  • Strong understanding of data architecture and engineering (lakehouse/data platform concepts), including data quality, lineage/metadata, idempotent pipelines, backfills, and governance/PII controls relevant to GenAI.
  • Strong cloud-native experience on AWS, including secure deployment and operations (e.g., EKS and/or managed services), plus cost/latency management.
  • Proven ability to translate complex technical issues to senior stakeholders; excellent communication, attention to detail, and follow-through.

Nice to have

  • Bachelor’s/Master’s degree in Computer Science (or equivalent practical experience).
  • Working knowledge of PyTorch or TensorFlow (enough to partner effectively with ML practitioners).
  • Experience with ML/GenAI evaluation automation and CI/CD quality gates (beyond basic offline testing).
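The CI/CD quality gates mentioned above can be as simple as comparing offline eval metrics against a stored baseline and failing the pipeline on regression. A minimal sketch, with hypothetical metric names, scores, and tolerance:

```python
# Hypothetical baseline from a previous accepted release; in practice this
# would be produced by an eval harness run against a fixed offline test suite.
BASELINE = {"groundedness": 0.86, "answer_relevance": 0.81}

def gate(current, baseline, tolerance=0.02):
    """Return metrics that regressed beyond the tolerance (empty = pass)."""
    return {
        name: (score, current.get(name, 0.0))
        for name, score in baseline.items()
        if current.get(name, 0.0) < score - tolerance
    }

current_run = {"groundedness": 0.87, "answer_relevance": 0.76}
regressions = gate(current_run, BASELINE)
if regressions:
    print(f"regression gate failed: {regressions}")  # in CI, exit nonzero here
else:
    print("regression gate passed")
```

Wiring this check into the release pipeline is what turns "evaluation" into the regression gating the responsibilities section calls for.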

What the JD emphasized

  • trusted, secure, stable, and scalable GenAI capabilities
  • production-grade agentic workflows
  • RAG-based systems
  • reusable frameworks
  • shared components
  • best practices
  • accelerate delivery
  • ensure consistency
  • GenAI engineering trends
  • complex technical tradeoffs
  • clear guidance for senior stakeholders
  • measurable business impact
  • reusable GenAI/ML engineering assets
  • reference implementations
  • standardized templates/SDKs
  • shared RAG components
  • deployment patterns
  • shared tools and platforms
  • streamline the end-to-end lifecycle
  • data pipelines
  • orchestration
  • evaluation
  • monitoring/telemetry
  • release governance
  • agentic GenAI workflows
  • planning/execution patterns
  • tool calling
  • state management
  • retries
  • guardrails
  • permissions
  • observability
  • Generative AI evaluation and feedback loops
  • offline test suites
  • human review
  • continuous evaluation
  • telemetry-based monitoring
  • regression gating in CI/CD
  • strategy and development
  • multiple GenAI products, applications, and technology portfolios
  • common capabilities that scale across teams
  • one-off solutions
  • technical feasibility
  • business value
  • GenAI use cases
  • build-vs-buy decisions
  • pragmatic solution designs
  • firmwide AI/ML stakeholders
  • standards, interoperability, adoption, and reuse
  • shared frameworks
  • complex technical issues and tradeoffs
  • quality vs latency vs cost
  • evaluation design
  • governance
  • security
  • leadership
  • well-informed strategic decisions
  • Influence across business, product, and technology teams
  • senior stakeholder relationships
  • mentor engineers and practitioners
  • raise engineering and delivery standards
  • 10+ years of applied experience
  • software engineering and/or data engineering
  • Python, Java, or similar languages
  • building production distributed systems end-to-end
  • Hands-on experience designing and delivering GenAI systems to production
  • RAG (embeddings, retrieval/indexing)
  • evaluation/monitoring
  • Hands-on experience building agentic workflows
  • tool calling, orchestration, state, retries, guardrails
  • frameworks such as LangChain/LangGraph or equivalent
  • Strong understanding of data architecture and engineering
  • lakehouse/data platform concepts
  • data quality, lineage/metadata
  • idempotent pipelines, backfills
  • governance/PII controls relevant to GenAI
  • Strong cloud-native experience on AWS
  • secure deployment and operations
  • EKS and/or managed services
  • cost/latency management
  • Proven ability to translate complex technical issues to senior stakeholders
  • excellent communication, attention to detail, and follow-through
  • Working knowledge of PyTorch or TensorFlow
  • partner effectively with ML practitioners
  • Experience with ML/GenAI evaluation automation
  • CI/CD quality gates
