Staff, Software Engineer

Walmart · Retail · Bentonville, AR +1

Staff Software Engineer focused on building foundational agentic AI systems for Walmart's Site Reliability Engineering organization. The role involves architecting and developing high-availability, resilient platforms that autonomously monitor, predict, and resolve issues across Walmart's vast technology ecosystem, impacting millions of customers and associates. The goal is to transform traditional SRE practices into intelligent, self-healing systems.

What you'd actually do

  1. Design and architect advanced frontend systems and platforms that serve as the foundation for reliability engineering and operations tools across Walmart's technology ecosystem
  2. Establish technical strategies and frontend architecture patterns that enable scalability, maintainability, and performance across distributed teams
  3. Lead the design of intelligent observability and monitoring interfaces that leverage modern UI patterns for anomaly detection, predictive analytics, and user-driven insights
  4. Architect frontend solutions that integrate with backend systems to deliver cohesive, intuitive experiences for engineers and operations teams
  5. Design and deliver code that's readable, maintainable, testable, scalable, reusable, and efficient

Skills

Required

  • Node.js
  • React
  • Next.js
  • TypeScript
  • GraphQL
  • Cytoscape.js
  • KeyLines SDK
  • Modern frontend tooling and testing frameworks
  • CI/CD systems such as Jenkins
  • HTML
  • CSS
  • JavaScript
  • React and modern JavaScript frameworks
  • graph visualizations
  • caching
  • logging
  • performance tuning
  • monitoring strategies
  • full app development life cycle
  • GitHub
  • version control best practices
  • unit testing
  • distributed team communication
  • mentoring
  • leading engineering teams
  • scalable frontend architectures

Nice to have

  • KeyLines SDK

What the JD emphasized

  • agentic AI systems
  • autonomously monitor, predict, and resolve issues
  • self-healing platforms
  • intelligent backbone for reliability engineering
  • Tier 0 high-availability, resilient agentic platforms
