Principal Software Engineer – Conversational AI

Walmart · Retail · Bentonville, AR

Walmart's Cortex Team is seeking a Principal Software Engineer to build and evolve its core AI conversational platform. This role involves designing and implementing NLU services, orchestrating model-serving microservices, optimizing serving for latency and cost, and potentially working on prompt engineering and agentic systems. The position requires strong software engineering fundamentals, experience with large-scale distributed systems, and a focus on scalability, performance, and cost trade-offs.

What you'd actually do

  1. Design, build, improve, and evolve our capabilities in at least some of the following areas: a service-oriented architecture that exposes our NLU capabilities at scale and enables increasingly sophisticated model orchestration.
  2. Design and build the primitives to efficiently orchestrate model-serving microservices, taking their dependencies into account and improving the combined latency and robustness of those microservices (e.g., fan a single request out to N services in parallel and reply with whichever gives the fastest answer).
  3. Bake in functionality that can drive improved machine-learning modeling and experimental design, such as A/B testing.
  4. Model serving and operations.
  5. Drive principled, scientific load-testing efforts to clearly identify the trade-offs at hand, and tune/optimize the model-serving stack.
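The parallel fan-out described in item 2 can be sketched roughly as below. This is a minimal illustration, not Walmart's actual stack: the service functions, names, and call signatures are hypothetical stand-ins.

```python
import asyncio

async def fan_out_fastest(request, services):
    # Fan a single request out to N model-serving services in parallel,
    # return whichever response arrives first, and cancel the rest.
    # (Hypothetical sketch -- real services would be RPC/HTTP clients.)
    tasks = [asyncio.create_task(svc(request)) for svc in services]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return done.pop().result()

# Toy stand-ins for two model-serving microservices with different latencies.
async def slow_service(req):
    await asyncio.sleep(0.2)
    return f"slow:{req}"

async def fast_service(req):
    await asyncio.sleep(0.01)
    return f"fast:{req}"

result = asyncio.run(fan_out_fastest("hello", [slow_service, fast_service]))
print(result)  # fast:hello
```

In practice this pattern trades extra compute (every fanned-out service does work) for lower tail latency, which is exactly the latency/cost trade-off the role calls out.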

Skills

Required

  • 8+ years of experience in software engineering or a related area
  • Solid data skills
  • Sound computer-science fundamentals
  • Strong programming experience
  • Deep hands-on technical expertise in full-stack development
  • Programming experience with at least one modern language with an efficient runtime, such as Scala, Java, C++, or C#
  • Experience with at least one relational database technology such as MySQL, PostgreSQL, Oracle, or MS SQL
  • Some level of fluency in Python
  • Understanding of the challenges of distributed data processing at scale
  • Ability to deal well with ambiguous/undefined problems
  • Ability to think abstractly
  • Ability to take a project from scoping requirements through actual launch
  • A continuous drive to explore, improve, enhance, automate, and optimize systems and tools
  • Capacity to apply scientific analysis and mathematical modeling techniques
  • Excellent oral and written communication skills
  • Bachelor's degree or certification in Computer Science, Engineering, Mathematics, or another related field

Nice to have

  • Large scale distributed systems experience, including scalability and fault tolerance
  • Experience taking a leading role in building complex data-driven software systems successfully delivered to customers
  • Relentless focus on scalability, latency, performance, robustness, and cost trade-offs, especially those present in highly virtualized, elastic, cloud-based environments
  • Exposure to cloud infrastructure such as OpenStack, Azure, GCP, or AWS, as well as infrastructure-management tech (Docker, Kubernetes)
  • Experience building/operating highly available systems of data extraction, ingestion, and massively parallel processing for large data sets
  • In particular, experience building large-scale data pipelines using big-data technologies (e.g. Spark / Kafka / Cassandra / Hadoop / Hive / BigQuery / Presto / Airflow)
  • Hands-on expertise in many disparate technologies, typically ranging from front-end user interfaces through to back-end systems and all points in between
  • Familiarity with Machine Learning

What the JD emphasized

  • core AI conversational platform
  • personal assistants
  • multi-modal experiences
  • Natural Language Understanding (NLU) services
  • orchestration
  • model-serving microservices
  • scalability and availability
  • model serving latency
  • operational costs
  • model serving
  • load-testing efforts
  • prompt engineering and agentic systems
  • reproducible workflow and models
  • continuous deployment
  • resource management capabilities
  • diagnostics for quality control
  • labeling tools
  • mission critical product
  • large scale distributed systems experience
  • scalability and fault tolerance
  • building complex data-driven software systems
  • scalability, latency, performance robustness, and cost trade-offs
  • highly virtualized, elastic, cloud-based environments
  • building/operating highly available systems
  • large scale data pipelines

Other signals

  • building and designing the next generation of Natural Language Understanding (NLU) services
  • design and build the primitives to efficiently orchestrate model-serving microservices
  • bake-in functionality which can drive improved machine learning modeling and experimental design, such as A/B testing
  • Model serving and operations
  • drive principled and scientific load-testing efforts
  • prompt engineering and agentic systems