Mercury's use of machine learning in risk decisioning is growing fast in scope and in stakes. Models increasingly drive real-time decisions about fraud and financial crime, and the Machine Learning Platform (MLP) team exists to build a paved path from a trained model to a reliable production deployment, speed up iteration, and ensure granular production observability.

MLP owns the production ML lifecycle: the systems that take a model from registry through deployment, real-time inference, observability, and retraining. Our Data Science colleagues author and train the models; we build the platform that lets them register, deploy, and observe those models in production without carrying the operational burden themselves — and we serve low-latency, highly available scores to the decision engine that depends on them. The platform supports business decisioning broadly, with our first use cases focused on fraud risk outcomes.

At Mercury, we are committed to crafting an exceptional banking* experience for startups. Our team is passionately focused on ensuring our products create a safe environment that meets the needs of our customers, administrators, and regulators.

* Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC.

As part of this role, you will:

Build and operate the real-time inference service that serves model scores to the risk decision engine, with low latency and high availability as first-class requirements
Own model deployment infrastructure — registry and versioning, CI/CD with performance, bias, and consistency checks, shadow mode, and staged rollouts
Build model observability: availability, latency, and error monitoring, plus drift detection as a retraining trigger
Partner with Risk Data Science to take models from a clean development-to-production handoff through to production operation under MLP ownership
Implement experimentation capabilities such as champion/challenger and canary routing, and explainability outputs like SHAP attributions
Feel a strong sense of product ownership and actively seek responsibility — we self-organize on small and medium projects, and we want someone excited to help shape and build a brand-new platform team

The ideal candidate for the role has:

5+ years in machine learning engineering, backend software engineering, MLOps, or a closely related field
Production ML service experience — deploying, serving, and operating models in low-latency, high-availability contexts
Strong backend engineering fundamentals in Python, with API frameworks like FastAPI or Flask
Experience with model deployment and lifecycle tooling: model registries, CI/CD for models, versioning, and staged rollout patterns (shadow, canary, champion/challenger)
Experience building observability and alerting for production services — latency, errors, and ideally model-specific signals like drift
Comfort with the data layer ML depends on: SQL, key-value/low-latency stores (Redis, DynamoDB, or equivalent), and streaming pipelines (Kafka, Kinesis, Redpanda, or equivalent)

Nice to have:

Familiarity with a modern data stack (Snowflake, dbt, Dagster, Airflow, or similar)
Experience operating in a regulated, audit-sensitive, or compliance-adjacent environment
Exposure to functional languages or willingness to work across a stack that includes Haskell, React, and TypeScript

The total rewards package at Mercury includes base salary, equity, and benefits. Our salary and equity ranges are highly competitive within the SaaS and fintech industries and are updated regularly using the most reliable compensation survey data for our industry. New hire offers are made based on a candidate’s experience, expertise, geographic location, and internal pay equity relative to peers.

Our target new hire base salary ranges for this role are the following:

US employees (any location): $166,600 - $208,300
Canadian employees (any location): CAD 157,400 - 196,800

Mercury values diversity & belonging and is proud to be an Equal Employment Opportunity employer. All individuals seeking employment at Mercury are considered without regard to race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, sexual orientation, or any other legally protected characteristic. We are committed to providing reasonable accommodations throughout the recruitment process for applicants with disabilities or special needs. If you need assistance, or an accommodation, please let your recruiter know once you are contacted about a role.
