(USA) Distinguished, Software Engineer

Walmart · Retail · Bentonville, AR +2

Distinguished Engineer for the Colony Platform in AI & Data, responsible for shaping the long-term technical vision, architecture, and engineering excellence of enterprise-scale AI/ML systems and platforms. Focuses on driving technical strategy, architecting for scale, mentoring technical leaders, and embedding responsible AI foundations. The role involves leading architecture reviews for GenAI integrations, model serving, and intelligent pipelines, influencing AI/ML lifecycle practices, and shaping patterns for safe AI adoption. Requires deep experience in AI/ML systems productionization, model serving, orchestration, performance, and observability.

What you'd actually do

  1. Define and drive the technical vision for AI/ML platforms and ecosystems that operate at enterprise scale.
  2. Lead architecture reviews and design principles across GenAI integrations, model serving, and intelligent pipelines.
  3. Provide hands-on technical guidance and thought leadership to multi-team engineering and data science groups.
  4. Influence AI/ML lifecycle practices, including model evaluation, deployment, monitoring, and observability.
  5. Raise engineering excellence by mentoring senior ICs, principal engineers, and emerging technical talent.

Skills

Required

  • 10+ years of experience building and operating large-scale, distributed software systems.
  • Proven track record as a senior individual contributor driving system architecture or technical strategy.
  • Deep experience with AI/ML systems productionization, model serving, orchestration, and performance considerations.
  • Experience with observability: structured logging, metrics, traces, and debugging distributed flows across client and gateway.
  • Strong expertise in scalable cloud-native design and platform engineering.
  • Demonstrated ability to influence engineers and leaders without direct managerial authority.
  • Excellent communication and technical storytelling skills, with the ability to articulate complex concepts to technical and non-technical audiences.

Nice to have

  • Experience leading AI/ML systems or platforms in a large enterprise environment.
  • Experience with LLM tool-calling / agentic systems (structured tool invocation, schema validation, prompt/tool definition design, guardrails).
  • Experience with schema/contract frameworks (JSON Schema, OpenAPI, Pydantic, protobuf) and backward-compatible tool evolution.
  • Familiarity with responsible AI practices and governance frameworks.
  • Experience with GenAI, prompt engineering patterns, or large-scale inference infrastructure.
  • Proven mentorship and leadership across multiple engineering teams.

What the JD emphasized

  • shaping the long-term technical vision
  • architecture
  • engineering excellence
  • enterprise-scale AI/ML systems and platforms
  • driving technical strategy
  • architecting for scale
  • mentoring technical leaders
  • GenAI integrations
  • model serving
  • intelligent pipelines
  • AI/ML lifecycle practices
  • model evaluation
  • deployment
  • monitoring
  • observability
  • safe and governed AI adoption
  • cutting-edge prototyping
  • production strategies
  • mission-critical AI systems
  • large-scale, distributed software systems
  • system architecture
  • technical strategy
  • AI/ML systems productionization
  • orchestration
  • performance considerations
  • structured logging
  • metrics
  • traces
  • debugging distributed flows
  • scalable cloud-native design
  • platform engineering
  • influence engineers and leaders without direct managerial authority
  • LLM tool-calling / agentic systems
  • structured tool invocation
  • schema validation
  • prompt/tool definition design
  • guardrails
  • GenAI
  • prompt engineering patterns
  • large-scale inference infrastructure
  • technical compass
  • how we build, scale, and govern AI systems

Other signals

  • driving technical strategy
  • architecting for scale
  • mentoring technical leaders
  • enterprise-scale AI/ML systems and platforms
  • GenAI integrations
  • model serving
  • intelligent pipelines
  • responsible AI foundations
  • scalable, resilient solutions
  • AI/ML lifecycle practices
  • model evaluation, deployment, monitoring, and observability
  • safe and governed AI adoption
  • cutting-edge prototyping and production strategies
  • mission-critical AI systems