Data Scientist - Vice President

JPMorgan Chase · Banking · McLean, VA +1 · Corporate Sector

Lead Data Scientist at JPMorgan Chase within Cybersecurity and Technology Controls, responsible for developing and deploying Machine Learning solutions for security use cases like fraud and threat detection. The role involves data analysis, model development, feature engineering, model governance, and producing secure production code, with a focus on Generative AI, transformer architectures, and responsible AI practices.

What you'd actually do

  1. Work with stakeholders and business leaders to understand security needs and recommend business modifications during periods of vulnerability.
  2. Work with cybersecurity engineers and data engineers to acquire data that addresses each use case (fraud, anomaly detection, cyber threats).
  3. Perform Exploratory Data Analysis on datasets and communicate results to stakeholders.
  4. Select statistical or Deep Learning models that are best positioned to achieve business results.
  5. Perform feature engineering or hyperparameter tuning to optimize model performance.

Skills

Required

  • Formal training or certification on security engineering concepts and 5+ years applied experience
  • Advanced in one or more programming languages
  • Advanced understanding of agile methodologies such as CI/CD, Application Resiliency, and Security
  • Working knowledge of probability, statistics, and statistical distributions and their applicability to use cases, plus the ability to perform Exploratory Data Analysis using Jupyter or SageMaker notebooks
  • Proficient in Pandas, SQL and Data Visualization tools such as Matplotlib, Seaborn or Plotly
  • Working knowledge of Scikit-Learn for development of classification, regression and clustering models and Deep Learning frameworks such as PyTorch
  • Experience with feature engineering complex datasets
  • Ability to explain model selection, model interpretability, and performance metrics verbally and in writing

Nice to have

  • Experience with Knowledge Graphs, graph analytics, and graph databases
  • Working knowledge of Large Language Models (LLM), NLP, Embedding Models and Vector Databases
  • Experience with Retrieval Augmented Generation (RAG) applications and the frameworks used to create them, such as LangChain or LlamaIndex
  • Experience with AI Agent frameworks such as Google ADK and LangGraph
  • Experience deploying Statistical or Machine Learning models via AWS SageMaker in a production setting
  • Working knowledge of Responsible AI, model fairness, and reliability and safety

What the JD emphasized

  • must have a working knowledge
  • working knowledge of Generative AI models
  • working knowledge of probability, statistics
  • Working knowledge of Scikit-Learn
  • Working knowledge of Large Language Models (LLM)
  • Working knowledge of Responsible AI

Other signals

  • develop Machine Learning solutions
  • detection and prevention of misuse, circumvention, and malicious behavior
  • develops secure and high-quality production code