Senior Data Scientist - Shopping Experience (Search)

Instacart · Consumer · United States · Remote · Data Science

Senior Data Scientist on the Shopping Experience team at Instacart, focusing on Search. This role owns the analytics and experimentation strategy for search relevance, ranking quality, and latency, partnering with Product, Engineering, and ML. Responsibilities include defining metrics, designing and analyzing A/B tests, building diagnostic analyses, connecting offline model evaluation with online metrics, and improving data quality for search. The role requires strong analytical skills, product sense, and communication, with experience in A/B testing, SQL, Python/R, and modern AI tooling. Preferred qualifications include experience in search relevance, recommendations, NLP, embeddings, and bridging offline/online evaluation.

What you'd actually do

Own core Search metrics and funnels end to end (e.g., query → impression → engagement → cart adds), including defining guardrails, monitoring performance across platforms and segments, and diagnosing conversion gaps.
Design, run, and interpret experiments across ranking, retrieval, and search UX (e.g., relevance model changes, query understanding, result layouts), turning ambiguous or conflicting outcomes into crisp, data-driven recommendations.
Partner with Product, Engineering, and ML to prioritize opportunities, size impact, and influence the roadmap for relevance, quality, and latency improvements that unlock measurable business outcomes.
Build deep diagnostic analyses by query class, price point, surface, and customer lifecycle to pinpoint where and why Search underperforms and specify concrete changes that will move key outcomes.
Connect offline model evaluation with online and business metrics by collaborating with ML partners on evaluation design, ensuring model changes reliably improve end-user experience—not just offline scores.
Improve data quality, instrumentation, and metric definitions for Search so that teams can reason about performance with clarity, consistency, and speed.

Skills

Required

Advanced SQL proficiency, including complex joins and window functions, working with large-scale datasets in modern data warehouses (e.g., Snowflake, BigQuery, Redshift).
Proficiency in Python or R for analysis, experimentation, and modeling.
Hands-on experience designing and analyzing A/B tests end to end, including metric selection, power and sample sizing, covariate adjustment, and decision-making under uncertainty.
Demonstrated ability to define success metrics, decompose ambiguous product problems, and deliver clear, opinionated recommendations to Product and Engineering partners.
Excellent written and verbal communication skills; able to tailor complex analyses for both technical and non-technical audiences.
Bachelor’s degree in a quantitative field (e.g., Statistics, Computer Science, Mathematics, Economics, Engineering) or equivalent practical experience.
Comfort using modern AI tooling (e.g., Claude, code assistants, PromptQL) to accelerate analysis, experimentation, and communication while exercising strong judgment on quality and reliability.

Nice to have

Experience in search relevance, ranking, recommendations, personalization, or information retrieval (e.g., e-commerce or marketplace search).
Familiarity with NLP, embeddings, and semantic search, including how to evaluate and iterate on these techniques in production.
Experience bridging offline evaluation metrics (e.g., NDCG, precision/recall, human evaluation) with online experiments and business outcomes.
Background in causal inference beyond standard A/B tests (e.g., holdouts, diff-in-diff, quasi-experiments) to measure long-term or cross-surface effects.
Comfort working across web and native app surfaces, navigating tradeoffs between relevance, monetization, and latency.
Proven impact improving logging, instrumentation, and metric definitions in complex data environments.

● Active

Posted 2mo ago · last seen 1w ago · 66 days open

AI score: 7/10
Stage: Ship Eval Gate
Location: United States · Remote
Role: Senior · Applied
Function: Engineering
Domain: consumer
Team: Shopping Experience
Maturity: Scaling

Skills

Agents & Autonomy

Tool Use

Applied ML Domains

Aerospace · Data Science · Personalization · Recommendation Systems

Data Engineering

BigQuery · Redshift · SQL · Snowflake

General Experience & Skills

Data-Driven Decision Making · Product Impact

Infrastructure & Systems

Cloud Native

LLM & Foundation Models

AI Copilot / Coding Assistant · Embeddings · Prompt Engineering

Languages

Python

ML Ops & Evaluation

A/B Testing · Production ML Systems

Math & Foundations

Causal Inference · Statistics

NLP & Language

Information Retrieval · Natural Language Processing

Research & Credentials

Applied Mathematics

Retrieval & Search

Ranking & Relevance · Search Engines · Semantic Search

We're transforming the grocery industry

At Instacart, we invite the world to share love through food because we believe everyone should have access to the food they love and more time to enjoy it together. Where others see a simple need for grocery delivery, we see exciting complexity and endless opportunity to serve the varied needs of our community. We work to deliver an essential service that customers rely on to get their groceries and household goods, while also offering safe and flexible earnings opportunities to Instacart Personal Shoppers.

Instacart has become a lifeline for millions of people, and we’re building the team to help push our shopping cart forward. If you’re ready to do the best work of your life, come join our table.

Instacart is a Flex First team

There’s no one-size-fits-all approach to how we do our best work. Our employees have the flexibility to choose where they do their best work, whether it’s from home, an office, or their favorite coffee shop, while staying connected and building community through regular in-person events. Learn more about our flexible approach to where we work.

Overview

Instacart’s Shopping Experience team is focused on making it fast and effortless for customers to find the right items within a single retailer and complete their order with confidence. As a Senior Data Scientist dedicated to Search, you’ll own the analytics and experimentation strategy that powers how we interpret customer intent and connect it to the most relevant items and retailers.

In this role, you’ll partner closely with Product, Engineering, and Machine Learning to shape the roadmap for search relevance, ranking quality, and latency. Your work will translate complex, noisy signals into clear insights and recommendations that move the metrics that matter—search conversion, order rate, and GTV—while also strengthening downstream experiences like ads and retailer satisfaction.

About the Job

Own core Search metrics and funnels end to end (e.g., query → impression → engagement → cart adds), including defining guardrails, monitoring performance across platforms and segments, and diagnosing conversion gaps.
Design, run, and interpret experiments across ranking, retrieval, and search UX (e.g., relevance model changes, query understanding, result layouts), turning ambiguous or conflicting outcomes into crisp, data-driven recommendations.
Partner with Product, Engineering, and ML to prioritize opportunities, size impact, and influence the roadmap for relevance, quality, and latency improvements that unlock measurable business outcomes.
Build deep diagnostic analyses by query class, price point, surface, and customer lifecycle to pinpoint where and why Search underperforms and specify concrete changes that will move key outcomes.
Connect offline model evaluation with online and business metrics by collaborating with ML partners on evaluation design, ensuring model changes reliably improve end-user experience—not just offline scores.
Improve data quality, instrumentation, and metric definitions for Search so that teams can reason about performance with clarity, consistency, and speed.

About You

You combine rigorous analytics, strong product sense, and clear communication to drive decisive action in a fast-paced environment. You enjoy rolling up your sleeves, collaborating across disciplines, and using experimentation to uncover what truly helps customers find the right items quickly.

Minimum Qualifications

5+ years of experience in data science or product analytics, with a track record of impact on consumer-facing products.
Advanced SQL proficiency, including complex joins and window functions, working with large-scale datasets in modern data warehouses (e.g., Snowflake, BigQuery, Redshift).
Proficiency in Python or R for analysis, experimentation, and modeling.
Hands-on experience designing and analyzing A/B tests end to end, including metric selection, power and sample sizing, covariate adjustment, and decision-making under uncertainty.
Demonstrated ability to define success metrics, decompose ambiguous product problems, and deliver clear, opinionated recommendations to Product and Engineering partners.
Excellent written and verbal communication skills; able to tailor complex analyses for both technical and non-technical audiences.
Bachelor’s degree in a quantitative field (e.g., Statistics, Computer Science, Mathematics, Economics, Engineering) or equivalent practical experience.
Comfort using modern AI tooling (e.g., Claude, code assistants, PromptQL) to accelerate analysis, experimentation, and communication while exercising strong judgment on quality and reliability.

Preferred Qualifications

Experience in search relevance, ranking, recommendations, personalization, or information retrieval (e.g., e-commerce or marketplace search).
Familiarity with NLP, embeddings, and semantic search, including how to evaluate and iterate on these techniques in production.
Experience bridging offline evaluation metrics (e.g., NDCG, precision/recall, human evaluation) with online experiments and business outcomes.
Background in causal inference beyond standard A/B tests (e.g., holdouts, diff-in-diff, quasi-experiments) to measure long-term or cross-surface effects.
Comfort working across web and native app surfaces, navigating tradeoffs between relevance, monetization, and latency.
Proven impact improving logging, instrumentation, and metric definitions in complex data environments.

Instacart provides highly market-competitive compensation and benefits in each location where our employees work. This role is remote and the base pay range for a successful candidate is dependent on their permanent work location. Please review our Flex First remote work policy here.

Offers may vary based on many factors, such as candidate experience and skills required for the role. Additionally, this role is eligible for a new hire equity grant as well as annual refresh grants. Please read more about our benefits offerings here.

For US-based candidates, the base pay ranges for a successful candidate are listed below.

CA, NY, CT, NJ

$194,000—$204,500 USD

$185,000—$195,500 USD

OR, DE, ME, MA, MD, NH, RI, VT, DC, PA, VA, CO, TX, IL, HI

$177,000—$187,000 USD

All other states

$161,000—$170,000 USD