Join a team building secure, scalable, and reliable machine learning solutions that support critical business outcomes. You will work across the full lifecycle, from exploratory analysis and model development to deployment, monitoring, and continuous improvement. This role blends hands-on applied machine learning with strong engineering practices to deliver production-grade AI systems.

As a Data Scientist Lead – Vice President in the Chief Technology Office, you deliver end-to-end AI and machine learning solutions that are secure, stable, and scalable. You conduct applied research, build and improve models, and design production-grade workflows for deployment and monitoring. You collaborate closely with engineers and stakeholders to define integration patterns, testing strategies, and reliability standards. You support delivery in regulated environments through strong documentation and operational readiness practices.

Job Responsibilities

Perform data exploration and analysis to assess distributions, data quality issues, leakage risks, missingness, bias, and anomalies, and define data readiness criteria.
Conduct applied research to evaluate modeling approaches (classical machine learning, deep learning, and generative AI where relevant), and document findings, trade-offs, and recommendations.
Build baseline models and iteratively improve performance through feature engineering, error analysis, and interpretability techniques.
Design and deploy generative AI applications, including fine-tuning, Retrieval-Augmented Generation systems, and agentic AI frameworks.
Build and maintain automated machine learning workflows for training, evaluation, packaging, deployment, and monitoring with a focus on reliability and reproducibility.
Apply infrastructure-as-code practices to provision and manage AWS resources for AI and machine learning workloads.
Collaborate with engineers to define deployment and integration patterns (batch, real-time, event-driven) and testing strategies.
Design and implement testing strategies (unit, component, integration, end-to-end, performance, and champion/challenger where appropriate).
Mentor team members on coding practices, AI and machine learning best practices, and maintainable implementation patterns.
Contribute to design reviews, operational readiness reviews, and documentation to raise overall engineering quality.
Support delivery in regulated environments by participating in documentation, reviews, and audit readiness activities.

Required Qualifications, Capabilities, and Skills

Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field, plus 7+ years of relevant experience.
Hands-on experience with data exploration and data validation (leakage, bias, missingness, outliers, and data quality) using frameworks such as PySpark, pandas, or Dask.
Proficiency in Python for data science and modeling with production-quality coding practices and comprehensive testing.
Proficiency with machine learning frameworks such as PyTorch, TensorFlow, PyTorch Lightning, or scikit-learn.
Proficiency with cloud-based development on AWS.
Experience applying natural language processing and large language model techniques such as prompt engineering, embeddings, and retrieval patterns.
Experience building APIs (for example, FastAPI).
Experience packaging and deploying containerized machine learning services (Docker; Kubernetes, ECS, or EKS).
Experience working with AWS services such as S3, IAM, CloudWatch, and ECS, along with SageMaker and/or Bedrock.
Exposure to infrastructure-as-code tooling such as Terraform.

Preferred Qualifications, Capabilities, and Skills

Experience delivering AI and machine learning solutions in a highly regulated environment.
AWS certification.
Knowledge of large language model evaluation methods, including quality, safety, guardrails, and reliability testing approaches.
Familiarity with model serving patterns and with running models in production (deployment, observability, and support).
Working knowledge of distributed compute platforms such as EMR or Databricks using PySpark for large-scale processing.