Principal, Data Architect

Walmart · Retail · Sunnyvale, CA

The Principal Data Architect will design and build a 'Semantic World Model' for Sam's Club that unifies data for intelligent agents. The work covers real-time state management, a tool-calling fabric for data products, enterprise standards for Agent Ready Data (GraphRAG, vector optimization), and reasoning guardrails for AI grounding. The role emphasizes an AI-native operating model and the modernization of legacy systems.

What you'd actually do

  1. Architect the Semantic World Model: Lead the design of a global semantic layer that unifies structured, semi-structured, and unstructured data into shared, context-aware representations (Knowledge Graphs and Metadata Tensors) optimized for LLM reasoning.
  2. Design for Real-Time State: Define the architecture for persistent agent memory and real-time state management. Ensure agents have a seamless "mental model" that synchronizes a member’s current session with their long-term history across all channels.
  3. Establish the Tool-Calling Fabric: Transition the enterprise from "Data-as-a-Table" to "Data-as-a-Tool" for agentic enablement. Design the architectural contracts and schemas that allow agents to call data products as executable functions with deterministic outcomes.
  4. Lead the Agentic Reference Architecture: Define and steward enterprise standards for Agent Ready Data. This includes creating reference patterns for GraphRAG (graph-based retrieval-augmented generation), vector space optimization, and hierarchical ontology design.
  5. Architect Trust & Grounding: Design "Reasoning Guardrails" into the data layer. Ensure that the semantic architecture provides verifiable grounding for AI actions, minimizing hallucinations through strict policy enforcement and deterministic data paths.

Skills

Required

  • 10–15+ years of experience in large-scale distributed data architecture
  • Architectural Fluency in Agentic Systems
  • Deep expertise in the "Context Stack", including Vector Databases, Enterprise Knowledge Graphs, and the orchestration of multi-agent workflows
  • Expert-level knowledge of Ontology Engineering and Metadata Management
  • Deep understanding of real-time event processing and how to blend high-volume streaming data with analytical stores
  • Systems Thinking
  • Python, Java, or Scala
  • Databricks
  • Spark
  • BigQuery
  • Kafka
  • Druid
  • Spark Structured Streaming
  • Kafka Connect
  • Apache Flink
  • LangChain/LangGraph
  • Prompt Engineering
  • Camunda
  • LookML
  • GCP or Azure cloud-native ecosystems

Nice to have

  • MCP Server
  • Multimodal AI

What the JD emphasized

  • Cognitive Data Infrastructure
  • Semantic World Model
  • AI-native operating model
  • Architectural Fluency in Agentic Systems
  • Ontology Engineering
  • Metadata Management
  • Agent Ready Data
  • GraphRAG
  • vector space optimization
  • Reasoning Guardrails

Other signals

  • Architecting a unified, real-time, and context-aware layer for intelligent agents
  • Designing frameworks for AI reasoning across data and persistent memory
  • Establishing data contracts for agents to call data products as executable functions
  • Defining enterprise standards for Agent Ready Data including GraphRAG and vector space optimization
  • Building a decoupled semantic interface for model-agnostic intelligence
  • Designing 'Reasoning Guardrails' for verifiable grounding and hallucination minimization