Position Summary...

As a Principal Data Scientist at Walmart, you will define and execute the data science roadmap for the experimentation platform that powers trusted decision-making across Walmart’s A/B testing ecosystem. This is a hands-on technical leadership role at the intersection of experimentation science, large-scale data systems, and AI evaluation. You will own the scientific direction behind experiment reporting, dashboards, guardrails, and reusable measurement services, ensuring experiment exposure data is stitched to business and operational outcomes with rigor, scalability, and clarity. You will partner closely with engineering, product, and business teams to modernize our statistical tooling, improve self-service experimentation, and extend our measurement framework to emerging AI use cases including LLM evals, prompt evaluation, hybrid human/LLM judging, and offline-to-online quality measurement. We are looking for a self-starter who can move fluidly from strategy to hands-on prototyping, quickly validating ideas through lightweight automated workflows and proofs of concept.

What you'll do...

About the team

Our team owns and manages Walmart’s experimentation platform, enabling A/B testing across multiple channels and regions. We build and maintain the scalable infrastructure, data foundations, and measurement systems required to support high experiment volume with reliable and accurate outcomes. One of the team’s core responsibilities is generating experiment reports and dashboards that translate raw experiment data into trusted business insights. To do this, we own a broad set of ETL processes that generate, transform, and stitch experiment exposure data with business and operational metrics. We also develop and maintain the statistical processes and guardrails that underpin sound decision-making, including sample imbalance checks, metric validation, and analysis standards. As experimentation expands into AI-powered experiences, the team is evolving the platform to support LLM evals, prompt evaluation, and new approaches to measuring quality, customer impact, and business value.

What you’ll do

Define the multi-year data science roadmap for experimentation reporting, dashboards, and measurement services, identifying the highest-leverage investments in methodology, automation, and self-service.
Lead the design of scalable statistical frameworks for online experiments across product, business, and operational use cases, including guardrails, heterogeneity analysis, sequential decisioning, variance reduction, and quasi-experimental methods when randomized tests are not feasible.
Partner with data engineering to design robust SQL and PySpark data models, pipelines, and observability standards that improve correctness, speed, and reusability of experimentation data assets.
Establish and govern canonical experiment metrics, scorecards, and reporting standards across channels, regions, and surfaces.
Define the strategy for AI-native experimentation and evaluation, including LLM eval frameworks, prompt evaluation, golden datasets, rubric design, human-in-the-loop review, LLM-as-a-judge calibration, and ongoing regression monitoring.
Build lightweight proofs of concept and small automated workflows using tools such as Python, SQL, Airflow, and Google Cloud Platform technologies to validate ideas before broader platform investment.
Serve as the senior technical advisor to leaders across product, engineering, and business on experimental design, causal interpretation, metric tradeoffs, and measurement risk.

What you’ll bring

Deep expertise in experimentation, causal inference, and statistical decision-making, with a track record of shaping how organizations design, analyze, and operationalize experiments at scale.
Expert-level SQL and PySpark, strong Python skills, and hands-on experience working with high-volume, distributed data pipelines in production environments.
Experience building or materially improving experimentation platforms, measurement systems, or internal science tooling rather than only delivering one-off analyses.
Strong understanding of metric design, guardrails, data quality, and observability for experimentation systems, including sample ratio mismatch, exposure correctness, and downstream metric integrity.
Self-starter mindset, with the ability to work through ambiguity, define a roadmap, and independently drive ideas from concept to execution.
Experience in e-commerce, retail, marketplace, logistics, last-mile delivery, or other high-scale consumer platforms with complex operational feedback loops.
Working knowledge of modern AI evaluation methods, including LLM evals, prompt experimentation, model or prompt regression testing, and hybrid human-plus-automated quality frameworks.
Ability to translate ambiguous business problems into rigorous analysis plans, technical designs, and executive-ready recommendations.

Minimum qualifications

Bachelor’s degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology, Operations Research, or related field and 10 years’ experience in data science, experimentation, measurement science, or related field.

OR Master’s degree in one of the above fields and 8 years’ relevant experience.

OR PhD in one of the above fields and 6 years’ relevant experience.

In all cases, candidates should demonstrate strong hands-on experience with SQL, Spark/PySpark, experimentation, and causal inference at production scale.

Preferred qualifications

Experience building or scaling experimentation platforms, internal measurement tooling, or self-service analytics capabilities.
Experience supporting high-volume A/B testing in e-commerce, marketplace, or last-mile environments.
Deep knowledge of advanced experimentation methods such as CUPED/CUPAC, switchback designs, cluster randomization, interference and network effects, Bayesian or sequential testing, and observational causal inference.
Experience defining AI evaluation frameworks for conversational AI, search, recommendation, or other LLM-powered products.
Experience with Google Cloud Platform, Airflow, and modern orchestration, monitoring, and data workflow patterns.
Publications, patents, or conference contributions in experimentation, causal inference, AI evaluation, or applied machine learning.
Successful completion of one or more assessments in Python, Spark, Scala, or R.

At Walmart, we offer competitive pay as well as performance-based bonus awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision and dental coverage. Financial benefits include 401(k), stock purchase and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting. Other benefits include short-term and long-term disability, company discounts, Military Leave Pay, adoption and surrogacy expense reimbursement, and more.

You will also receive PTO and/or PPTO that can be used for vacation, sick leave, holidays, or other purposes. The amount you receive depends on your job classification and length of employment. It will meet or exceed the requirements of paid sick leave laws, where applicable. For information about PTO, see https://one.walmart.com/notices.

Live Better U is a Walmart-paid education benefit program for full-time and part-time associates in Walmart and Sam's Club facilities. Programs range from high school completion to bachelor's degrees, including English Language Learning and short-form certificates. Tuition, books, and fees are completely paid for by Walmart.

Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to a specific plan or program terms. For information about benefits and eligibility, see One.Walmart.

Sunnyvale, California US-11657: The annual salary range for this position is $143,000.00 - $286,000.00
Bentonville, Arkansas US-10735: The annual salary range for this position is $110,000.00 - $220,000.00

Additional compensation includes annual or quarterly performance bonuses. Additional compensation for certain positions may also include:

Stock

Minimum Qualifications...

Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications.

Option 1: Bachelor's degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology or related field and 5 years' experience in an analytics-related field.
Option 2: Master's degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology or related field and 3 years' experience in an analytics-related field.
Option 3: 7 years' experience in an analytics or related field.

Preferred Qualifications...

Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications.

Data science, machine learning, optimization models
PhD in Machine Learning, Computer Science, Information Technology, Operations Research, Statistics, Applied Mathematics, Econometrics
Publications or active peer reviewer in related journals or conference
Successful completion of one or more assessments in Python, Spark, Scala, or R
Using open source frameworks (for example, scikit learn, tensorflow, torch)
We value candidates with a background in creating inclusive digital experiences, demonstrating knowledge in implementing Web Content Accessibility Guidelines (WCAG) 2.2 AA standards, assistive technologies, and integrating digital accessibility seamlessly. The ideal candidate would have knowledge of accessibility best practices and join us as we continue to create accessible products and services following Walmart’s accessibility standards and guidelines for supporting an inclusive culture.

Primary Location...

1345 Crossman Ave, Sunnyvale, CA 94089-1114, United States of America

Walmart and its subsidiaries are committed to maintaining a drug-free workplace and have a no-tolerance policy regarding the use of illegal drugs and alcohol on the job. This policy applies to all employees and aims to create a safe and productive work environment.


Principal, Data Scientist, Experimentation Sciences
