Sr Lead Software Engineer - Cloud / ML / GenAI

JPMorgan Chase · Banking · Plano, TX +1 · Corporate Sector

Senior Lead Software Engineer focused on ML and GenAI solutions within Public Cloud Engineering at JPMorgan Chase. The role leads the architecture, development, and production deployment of ML/LLM solutions, applies modern GenAI workflows, and upholds responsible AI practices. Responsibilities include designing and implementing end-to-end ML/LLM solutions, productionizing models on public clouds using Kubernetes, establishing evaluation methodologies, and collaborating with stakeholders.

What you'd actually do

  1. Design and implement end-to-end ML and LLM solutions, from problem framing and data preparation through training, evaluation, deployment, and ongoing optimization.
  2. Apply modern GenAI workflows, including prompt engineering, tracing, evaluations, guardrails, and safety frameworks, to align model behavior with business objectives and risk controls.
  3. Productionize high-quality models and pipelines on public clouds, leveraging Kubernetes for container orchestration where appropriate.
  4. Establish robust offline and online evaluation methodologies, including intrinsic and extrinsic metrics (e.g., relevance, safety, latency, cost efficiency), and integrate automated testing/monitoring.
  5. Collaborate closely with product, platform, security, controls, and business stakeholders across a geographically distributed organization; provide technical mentorship and code reviews.

Skills

Required

  • software engineering concepts
  • Python
  • Java
  • GenAI/LLMs
  • prompt engineering
  • tracing
  • evaluations
  • guardrails
  • NLP
  • Generative AI
  • ML
  • deep learning
  • large language models
  • ML/DL toolkits and libraries
  • Transformers
  • Hugging Face
  • TensorFlow
  • PyTorch
  • NumPy
  • scikit-learn
  • pandas
  • AI/ML and GenAI solutions
  • training frameworks
  • metrics aligned to business goals
  • public cloud (AWS, GCP, or Azure)
  • containerization/orchestration
  • Docker
  • Kubernetes
  • data structures
  • algorithms
  • data mining
  • information retrieval
  • statistics
  • communication skills

Nice to have

  • Natural Language Processing
  • Reinforcement Learning
  • Ranking/Recommendation
  • Time Series Analysis
  • PyTorch
  • Keras
  • MXNet
  • scikit-learn
  • financial services or wealth management domains
  • Contributions to open-source ML/LLM tooling
  • certifications in AWS, Azure, GCP, or Kubernetes

What the JD emphasized

  • production deployment
  • responsible AI methods
  • evaluations
  • guardrails
  • safety frameworks
  • responsible AI considerations

Other signals

  • leading hands-on architecture, development, and production deployment of ML and LLM-powered solutions
  • apply strong engineering practices, rigorous experimentation, and responsible AI methods to deliver high-impact capabilities