Data Engineer III

Walmart · Retail · Bangalore, KA, India

The Data Engineer III role at Walmart focuses on building scalable, privacy-aware data pipelines and infrastructure. It involves designing and implementing ETL/ELT workflows, collaborating with data scientists and platform engineers, and embedding data governance and privacy principles in systems. The position emphasizes distributed data systems, data architecture, and privacy-by-design methodologies.

What you'd actually do

  1. Identify and acquire data sets that align with organisational business needs and strategic objectives
  2. Establish and support the development of robust data streaming systems to enable real-time data processing and delivery
  3. Develop comprehensive business intelligence reports and dashboards for executive leadership and company advisors
  4. Develop sophisticated algorithms to transform raw data into actionable business insights
  5. Establish, validate, and maintain optimal database pipeline architectures to support enterprise-scale operations

Skills

Required

  • B.E/B.Tech/MS in Computer Science or related field with 5+ years of experience in large-scale distributed systems
  • Hands-on experience with big data offerings including GCP BigQuery and Spark Streaming
  • Hands-on experience with big data tools (Hadoop, Apache Spark, Hive, Hudi) and workflow management (Apache Airflow)
  • Strong experience with SQL and NoSQL databases (Cassandra, Cosmos) and distributed SQL (Presto/Trino)
  • Experience with integration tools like Airflow and workflow automation
  • Expertise in querying and monitoring logs using Grafana and Splunk
  • Experience with CI/CD automation using Jenkins and GitHub Actions

Nice to have

  • Familiarity with in-memory processing, modern data formats (Avro, Parquet, JSON), and open table formats like Hudi

What the JD emphasized

  • privacy-aware data pipelines
  • data governance
  • privacy principles
  • privacy, security, and performance are balanced by design
  • Maintain strict adherence to data governance frameworks and security policies