Data Scientist [Multiple Positions Available]

JPMorgan Chase · Banking · Plano, TX +1 · Commercial & Investment Bank

Data Scientist at JPMorgan Chase responsible for designing, building, and implementing end-to-end data products, including product recommendation models for core banking products. The role involves creating actionable insights, collaborating across teams, and deploying scalable AI/ML models in a large-scale enterprise environment.

What you'd actually do

  1. Drive the design, build, and implementation of end-to-end data products that systemically publish actionable insights to commercial bankers and their clients.
  2. Create product recommendation models for core banking products, using internal and external data to inform modeling approaches, then prototype the models and implement them in production.
  3. Perform analytics exploring the relationship of companies, products, and banker engagement to inform the product roadmap and top-down decision-making.
  4. Drive collaboration across information architecture, data engineering, applied science, business intelligence, product management, data architecture, software engineering, and business analytics to bring products to market and architect data-driven solutions.
  5. Guide discovery of new data science solutions, collaborating directly with business and tech leaders across banking, product management, engineering, and applied science.

Skills

Required

  • Creating and maintaining models built on banking, financial, and external data sources
  • Building insights ecosystems to standardize disparate analytics into a single business-facing feed of insights
  • Designing and implementing scalable AI/ML models in large-scale enterprise environments using Databricks and MLflow
  • Developing end-to-end pipelines, managing model versioning, and enabling automated deployment and monitoring for production readiness
  • Defining business requirements and translating them into technical solutions
  • Optimizing model training and inference workflows for efficiency and scalability
  • Applying multivariate statistical methods including principal component analysis (PCA), linear discriminant analysis (LDA), and cluster analysis to uncover data patterns, reduce dimensionality, and inform feature engineering
  • Performing data preprocessing, outlier detection, and variable selection
  • Conducting exploratory data analysis to identify relationships and trends in high-dimensional datasets
  • Implementing feature selection and transformation techniques to improve model performance
  • Developing quantitative models to extract insights from enterprise datasets, including time series analysis, predictive modeling, and risk assessment
  • Validating model assumptions and assessing statistical significance of results
  • Communicating findings and recommendations to stakeholders through visualizations and reports
  • Building supervised machine learning models using techniques including regression, classification, and experimental design with model validation, hyperparameter tuning, and performance assessment using accuracy, precision, recall, and ROC-AUC metrics
  • Designing and executing controlled experiments using A/B testing to evaluate model impact
  • Implementing cross-validation and regularization methods to prevent overfitting and improve generalizability
  • Developing AI and ML models in a large-scale enterprise using techniques including clustering, decision trees, random forest, support vector machines, ensemble methods, boosting, and neural networks, with tools including TensorFlow and NLTK
  • Driving end-to-end project management using reproducible runs to deploy models into production

What the JD emphasized

  • scalable AI/ML models in large-scale enterprise environments
  • automated deployment and monitoring for production readiness
  • optimize model training and inference workflows

Other signals

  • product recommendation models