Senior, Data Engineer

Walmart · Retail · Chennai, India

Senior Data Engineer at Walmart focused on building and maintaining data pipelines, transforming data, and ensuring data governance. The role involves working with large-scale data sets to support data scientists and analysts, contributing to an orchestration layer for data transformations, and refining raw data into valuable assets.

What you'd actually do

  1. Data Strategy: Understands, articulates and applies principles of the defined strategy to routine business problems that involve a single function.
  2. Data Transformation and Integration: Extracts data from identified databases. Creates data pipelines and transforms data into a structure relevant to the problem by selecting appropriate techniques. Develops knowledge of current analytics trends.
  3. Data Source Identification: Supports the understanding of the priority order of requirements and service level agreements. Helps identify the most suitable source for data that is fit for purpose. Performs initial data quality checks on extracted data.
  4. Data Modelling: Analyses complex data elements, systems, data flows, dependencies, and relationships to contribute to conceptual, logical and physical data models. Develops logical and physical data models, including data warehouse and data mart designs. Defines relational tables, primary and foreign keys, and stored procedures to create a data model structure. Evaluates existing data models and physical databases for variances and discrepancies. Develops efficient data flows. Analyses data-related system integration challenges and proposes appropriate solutions.
  5. Code Development and Testing: Writes code to develop the required solution and application features by determining the appropriate programming language and leveraging business, technical and data requirements. Creates test cases to review and validate the proposed solution design. Creates proofs of concept. Tests the code using the appropriate testing approach. Deploys software to production servers. Contributes code documentation, maintains playbooks, and provides timely progress updates.

Skills

Required

  • Hadoop
  • Hive
  • Spark using Scala
  • Kubernetes
  • Cloud
  • API
  • Data Lake concepts
  • Java
  • Python
  • GCP
  • Azure
  • Data modelling
  • Data migration protocols
  • Automic
  • Airflow

Nice to have

  • Kafka Connect
  • Druid
  • BigQuery
  • Looker

What the JD emphasized

  • high standards
  • extremely high standard of code quality, system reliability, and performance