(USA) Principal, Data Engineer

Walmart · Retail · Sunnyvale, CA

Principal Data Engineer responsible for architecting and evolving the enterprise-scale data platform that powers intelligent, autonomous systems, AI agents, and copilots. The role focuses on defining data architecture, enabling agent-ready data, driving cross-domain strategy, and ensuring trust, reliability, and scale for AI-native workloads.

What you'd actually do

  1. Architect and evolve the core data platform strategy across batch, streaming, and hybrid systems to support both traditional analytics and AI-native workloads.
  2. Define enterprise patterns for agent-ready data, ensuring systems are discoverable, semantically rich, and optimized for LLMs, copilots, and multi-agent workflows.
  3. Shape reference architectures for RAG, real-time feature pipelines, vector indexing, and graph-augmented reasoning.
  4. Define standards for observability, telemetry, lineage, governance, and AI auditability across enterprise data systems.
  5. Lead modernization initiatives that transform traditional data lake systems into composable, event-driven, and agent-aware platforms.

Skills

Required

  • Cloud-native ecosystems (GCP or Azure preferred)
  • BigQuery, Dataflow, Pub/Sub, or equivalent
  • Batch and streaming systems (Kafka, Spark Structured Streaming, Flink, Druid, etc.)
  • Hybrid real-time + analytical architectures
  • Semantic modeling
  • Embeddings
  • Knowledge graphs
  • Vector search
  • RAG
  • Context enrichment
  • Agent orchestration
  • Schema, latency, and storage format optimization for AI
  • Data quality frameworks
  • Access control
  • Lineage
  • Compliance
  • Auditability
  • Trust and safety standards for AI-enabled systems
  • Python, Java, or Scala
  • Spark/PySpark
  • SQL optimization at scale
  • Systems thinking
  • Performance optimization
  • Stakeholder alignment
  • Architectural adoption
  • Executive-level communication

Nice to have

  • traditional analytics
  • AI-native workloads
  • LLMs
  • copilots
  • multi-agent workflows
  • graph-augmented reasoning
  • modernization initiatives
  • composable, event-driven, and agent-aware platforms

What the JD emphasized

  • enterprise-scale technical authority
  • architect and steward the foundational data systems that enable AI agents, copilots, and large-agent workflows
  • define how the organization builds systems
  • partner with Engineering, AI/ML, Product, and Platform leaders
  • influence and guide multiple teams in adopting scalable patterns without direct authority
  • 10–15+ years building and evolving large-scale distributed data platforms
  • Proven track record of shaping architecture across multiple domains or organizations
  • Strong understanding of semantic modeling, embeddings, knowledge graphs, and vector search
  • Experience designing data systems that support RAG, context enrichment, and agent orchestration
  • Ability to reason about schema, latency, storage format, and their impact on AI reasoning quality
  • Advanced knowledge of data quality frameworks, access control, lineage, compliance, and auditability
  • Experience defining trust and safety standards for AI-enabled systems
  • influence without authority
  • Demonstrated ability to align diverse stakeholders and drive architectural adoption across teams
  • Clear, executive-level communication and the ability to translate technical strategy into business value

Other signals

  • architecting foundational data systems for AI agents
  • defining enterprise patterns for agent-ready data
  • shaping reference architectures for RAG and agent orchestration