Data Domain Architect Lead

JPMorgan Chase · Banking · Wilmington, DE +1 · Consumer & Community Banking

Lead a team of data analysts to support data annotation and labeling for machine learning models in a fintech domain. This involves developing strategies for training data optimization, identifying patterns in conversational data using NLP, and evaluating the quality of ML outputs. Requires experience in ML data annotation, data modeling, and Python.

What you'd actually do

  1. Manage and coach a team of Machine Learning Data Domain analysts who annotate and label data and content using annotation tools and analysis
  2. Partner with leads in Data Science, Engineering, and Analytics to develop strategies to optimize training data for machine learning models
  3. Lead efforts to identify patterns and trends in conversational data through Natural Language Processing and/or other computational linguistic approaches
  4. Collaborate with stakeholders on evaluating the quality of machine learning classification and other outputs
  5. Actively contribute to the team’s continuous learning mindset by bringing in new ideas and perspectives that stretch the thinking of the group

Skills

Required

  • Python
  • Data modeling
  • Data annotation
  • Machine learning
  • Natural Language Processing
  • LLMs
  • Prompt engineering

Nice to have

  • SQL
  • Teradata
  • Oracle
  • SAS
  • Scala
  • Advanced Statistics

What the JD emphasized

  • 6+ years of related experience in development of machine learning solutions
  • Familiar with industry annotation and labeling methods
  • Broad expertise in data technologies, e.g., data warehousing, data processing, data quality concepts, Business Intelligence tools and analytical tools, unstructured data, machine learning
  • Working knowledge of machine learning and artificial intelligence paradigms and libraries
  • Familiar with Large Language Models (LLMs) and prompt engineering

Other signals

  • data annotation
  • training data
  • machine learning models