Data Engineer III

Walmart · Retail · Chennai, India

Data Engineer III role focused on building and maintaining data pipelines, data lakes, and ETL processes for Walmart's Global Data Organization. The role involves designing, developing, and testing data solutions, optimizing data utilization, and ensuring data availability and accuracy. It requires experience with Big Data technologies, distributed computing, cloud platforms, and strong SQL and scripting skills.

What you'd actually do

  1. Develop and implement best-in-class data pipelines, data lakes, and consumption/acceleration layers that make efficient use of capacity and meet SLAs.
  2. Translate business requirements into code, analytical reports, and tools.
  3. Design, build, test, and deploy cutting-edge solutions at scale, impacting a multi-billion-dollar business.
  4. Work closely with the product owner and technical lead, playing a major role in the overall delivery of assigned projects and enhancements.
  5. Learn and research on the go, working on new requests and projects while also supporting production.
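The duties above revolve around extract-transform-load work: landing raw records in a data lake, then pre-aggregating them into a consumption layer that downstream reports read. As a purely illustrative sketch (not Walmart's actual stack; the `sales_records` data and field names are invented for this example), a minimal consumption-layer aggregation in plain Python might look like:

```python
from collections import defaultdict

# Invented raw records, as they might land in a data lake.
sales_records = [
    {"store": "CHN-01", "sku": "A100", "qty": 3, "price": 4.50},
    {"store": "CHN-01", "sku": "A100", "qty": 2, "price": 4.50},
    {"store": "CHN-02", "sku": "B200", "qty": 1, "price": 12.00},
]

def build_consumption_layer(records):
    """Aggregate raw sale records into a per-store revenue view --
    the kind of pre-computed table a consumption layer serves so
    reports don't have to scan the raw data lake on every query."""
    revenue = defaultdict(float)
    for r in records:
        revenue[r["store"]] += r["qty"] * r["price"]
    return dict(revenue)

print(build_consumption_layer(sales_records))
# {'CHN-01': 22.5, 'CHN-02': 12.0}
```

In a real pipeline the same transform would typically run as a Spark job scheduled by Airflow, with the raw and aggregated tables living in Hive or BigQuery rather than Python lists.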

Skills

Required

  • Big Data Technologies
  • Hadoop
  • Spark
  • Hive
  • Dataproc
  • BigQuery
  • Kafka
  • Airflow Scheduler
  • Java
  • Python
  • Scala
  • AWS
  • Azure
  • GCP
  • SQL
  • Scripting

Nice to have

  • Linux systems
  • secure, scalable, and highly available services
  • data science
  • machine learning

What the JD emphasized

  • 3+ years’ experience
  • Minimum 2 years of experience in Big Data and distributed computing.
  • Proven experience building pipelines on Big Data technologies – Hadoop, Spark, Hive, Dataproc, BigQuery, Kafka, and Airflow, to name a few.
  • Deep understanding of the Hadoop ecosystem, strong conceptual knowledge of Hadoop architecture components, and hands-on experience with at least one Big Data technology using Java, Python, or Scala.
  • Strong knowledge of deploying and managing applications in AWS or Azure or GCP.
  • Strong scripting skills for processing large amounts of data, and high proficiency in SQL.