What you'd actually do

Architect, design, and lead the development of end‑to‑end data pipelines on Azure using Databricks (Spark / PySpark)

Own the design and evolution of lakehouse architecture using Azure Data Lake Storage (ADLS Gen2) and Delta Lake

Build, optimize, and scale batch and streaming pipelines for large‑volume, high‑velocity datasets

Design and manage feature engineering pipelines and curated datasets for AI/ML model training, validation, and inference

Partner closely with Data Scientists and ML Engineers to enable scalable, production‑ready ML workflows

Skills

Required

Python
PySpark
Spark SQL
Databricks
Azure Data Lake Storage (ADLS Gen2)
Azure Databricks
Azure Data Factory
Synapse Pipelines
Delta Lake
SQL
CI/CD
Git
DevOps

Nice to have

Azure Machine Learning
Databricks ML
Feature Store
MLflow
experiment tracking
Kafka
Azure Event Hubs
Spark Structured Streaming
dbt
Unity Catalog
enterprise data governance tools
Power BI
MLOps
Technical Lead
Principal Engineer
Architecture Owner
LangChain
Agent
Agent Architecture

What the JD emphasized

9+ years of hands‑on experience in Data Engineering, Data Platform, or Big Data roles

Deep expertise in Python, PySpark, and Spark SQL

Extensive, real‑world experience with Databricks

Strong experience with Azure cloud services

Expert‑level understanding of Delta Lake

Proven experience designing AI/ML data pipelines (training, validation, inference datasets)

Strong understanding of lakehouse, data warehousing, and dimensional modeling concepts

Hands‑on experience with CI/CD pipelines, Git, and DevOps practices for data platforms

Excellent troubleshooting, diagnostics, and performance tuning skills

Strong communication and stakeholder collaboration abilities

Other signals

Architect, build, and scale cloud-native data and AI platforms on Azure using Databricks

Lead complex, enterprise-scale data initiatives

Work closely with data scientists and ML engineers

Shape the organization’s data and AI strategy

Support and integrate with MLOps pipelines

Lead optimization of Databricks workloads for performance, scalability, reliability, and cost efficiency

Define and implement data quality, validation, monitoring, and observability frameworks

Enforce data security, governance, and compliance

Mentor and technically guide senior, mid-level, and junior data engineers

Lead architectural decision-making and contribute to long-term data platform and AI roadmap planning

Act as a technical authority and escalation point for complex data engineering challenges

We are looking for a Senior Advanced Data Engineer with 9+ years of experience to architect, build, and scale cloud‑native data and AI platforms on Azure using Databricks. This is a senior, hands‑on technical leadership role requiring deep expertise in data engineering, lakehouse architecture, and AI/ML data pipelines to enable advanced analytics, machine learning, and business intelligence use cases.

The ideal candidate will lead complex, enterprise‑scale data initiatives, work closely with data scientists and ML engineers, and play a critical role in shaping the organization’s data and AI strategy.

Architect, design, and lead the development of end‑to‑end data pipelines on Azure using Databricks (Spark / PySpark)
Own the design and evolution of lakehouse architecture using Azure Data Lake Storage (ADLS Gen2) and Delta Lake
Build, optimize, and scale batch and streaming pipelines for large‑volume, high‑velocity datasets
Design and manage feature engineering pipelines and curated datasets for AI/ML model training, validation, and inference
Partner closely with Data Scientists and ML Engineers to enable scalable, production‑ready ML workflows
Support and integrate with MLOps pipelines, including:
- Data and feature versioning
- Feature stores
- Model deployment readiness
Lead optimization of Databricks workloads for performance, scalability, reliability, and cost efficiency
Define and implement data quality, validation, monitoring, and observability frameworks
Enforce data security, governance, and compliance using Azure and Databricks best practices
Review designs and code, establish engineering standards, and ensure platform reliability
Mentor and technically guide senior, mid‑level, and junior data engineers
Lead architectural decision‑making and contribute to long‑term data platform and AI roadmap planning
Act as a technical authority and escalation point for complex data engineering challenges

Required Skills & Qualifications

9+ years of hands‑on experience in Data Engineering, Data Platform, or Big Data roles
Deep expertise in Python, PySpark, and Spark SQL
Extensive, real‑world experience with Databricks, including:
- Jobs, notebooks, workflows
- Delta Live Tables
- Performance tuning and job orchestration
Strong experience with Azure cloud services, including:
- Azure Data Lake Storage (ADLS Gen2)
- Azure Databricks
- Azure Data Factory and/or Synapse Pipelines
Expert‑level understanding of Delta Lake, including ACID guarantees, schema enforcement, and optimizations
Advanced SQL skills for analytical data modeling and transformations
Proven experience designing AI/ML data pipelines (training, validation, inference datasets)
Strong understanding of lakehouse, data warehousing, and dimensional modeling concepts
Hands‑on experience with CI/CD pipelines, Git, and DevOps practices for data platforms
Excellent troubleshooting, diagnostics, and performance tuning skills
Strong communication and stakeholder collaboration abilities

Preferred / Nice to Have Skills

Experience with Azure Machine Learning or Databricks ML
Hands‑on experience with Feature Store, MLflow, or experiment tracking frameworks
Streaming data experience using Kafka, Azure Event Hubs, or Spark Structured Streaming
Experience with dbt, Unity Catalog, or enterprise data governance tools
Familiarity with Power BI or other BI/visualization tools
Strong exposure to production‑grade MLOps systems and best practices
Prior experience as a Technical Lead, Principal Engineer, or Architecture Owner
Knowledge of LangChain, agents, and agent architecture
