What you'd actually do

Design and implement end-to-end ingestion pipelines from heterogeneous sources (including Snowflake, SQL Server, Excel, REST APIs, and unstructured data) into Azure Databricks

Architect and enforce a Medallion Architecture (Bronze → Silver → Gold), ensuring data arrives clean, validated, and fit for purpose at each layer

Build Delta Live Tables (DLT) pipelines with declarative data quality expectations, schema evolution, and automated lineage tracking

Implement incremental loading patterns using CDC (Change Data Capture), watermarking, and Delta Lake MERGE/UPSERT for efficient, scalable ingestion

Enable structured and unstructured data processing (documents, Excel files, JSON, Parquet), building the foundation for AI and ML consumption

Skills

Required

Databricks
PySpark
Delta Lake
Workflows
Unity Catalog
Medallion Architecture
Domain Data Modeling
Functional Data Architecture
Data Quality Frameworks
Incremental loading
CDC
CI/CD
Observability
Python
SQL
Azure Databricks
Production environments

Nice to have

DLT
GCP
Azure
Kafka
Databricks Certified Professional
financial datasets
engineering datasets
enterprise datasets
industrial-scale datasets

What the JD emphasized

4+ years hands-on: PySpark, Delta Lake, Workflows, Unity Catalog

Demonstrate expertise in data strategy, for example: Medallion Architecture, Domain Data Modeling and Functional Data Architecture

Data Quality Frameworks (e.g. rule-based validation, anomaly detection)

Data Pipelines: incremental loading, CDC, CI/CD, Observability

Advanced Python/PySpark and Advanced SQL

Proven experience building platforms, not just maintaining them: greenfield builds, migrations, framework development

Demonstrated ability to own technical decisions end-to-end, from architecture to production deployment

As a Senior Advanced Data Engineer here at Honeywell, you will play a crucial role in designing, developing, and maintaining advanced data solutions that drive business insights and support decision-making processes. You will leverage your expertise in data engineering to build scalable data pipelines, optimize data storage, and ensure data quality and integrity.

Your ability to work with cross-functional teams and translate business requirements into technical solutions will be key to your success in this role.

In this role, you will impact the business by enabling data-driven decision-making, optimizing data processes, and improving overall data management. Your work will contribute to increased operational efficiency, cost savings, and enhanced customer satisfaction.

At Honeywell, our people leaders play a critical role in developing and supporting our employees to help them perform at their best and drive change across the company. They help to build a strong, diverse team by recruiting talent, identifying and developing successors, driving retention and engagement, and fostering an inclusive culture.

AI-Ready Data Platform

Design and implement end-to-end ingestion pipelines from heterogeneous sources (including Snowflake, SQL Server, Excel, REST APIs, and unstructured data) into Azure Databricks
Architect and enforce a Medallion Architecture (Bronze → Silver → Gold), ensuring data arrives clean, validated, and fit for purpose at each layer
Build Delta Live Tables (DLT) pipelines with declarative data quality expectations, schema evolution, and automated lineage tracking
Implement incremental loading patterns using CDC (Change Data Capture), watermarking, and Delta Lake MERGE/UPSERT for efficient, scalable ingestion
Enable structured and unstructured data processing (documents, Excel files, JSON, Parquet), building the foundation for AI and ML consumption
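
For candidates unfamiliar with the incremental loading pattern named above, the following is a minimal, library-free sketch of watermarking combined with upsert semantics. It is an illustration only, not this team's implementation: the function name `incremental_upsert`, the `id` key, and the `updated_at` column are all hypothetical, and a plain dictionary stands in for a Delta table. On Databricks the same idea would typically be expressed with a Delta Lake MERGE.

```python
from datetime import datetime

def incremental_upsert(target, source_rows, watermark, key="id", ts="updated_at"):
    """Upsert rows newer than the watermark into target; return the new watermark.

    target is a dict keyed by primary key (a stand-in for a Delta table).
    All names here are hypothetical and chosen for illustration.
    """
    new_watermark = watermark
    for row in source_rows:
        if row[ts] <= watermark:
            continue  # already loaded by an earlier run, so skip it
        target[row[key]] = row  # insert a new key or overwrite an existing one
        new_watermark = max(new_watermark, row[ts])
    return new_watermark

# One existing row in the target, loaded up to 2024-01-01.
target = {1: {"id": 1, "value": "a", "updated_at": datetime(2024, 1, 1)}}
source = [
    {"id": 1, "value": "a2", "updated_at": datetime(2024, 1, 3)},    # update
    {"id": 2, "value": "b", "updated_at": datetime(2024, 1, 2)},     # insert
    {"id": 3, "value": "old", "updated_at": datetime(2023, 12, 31)}, # before watermark
]
watermark = incremental_upsert(target, source, datetime(2024, 1, 1))
```

Only the two rows changed after the previous watermark are touched, which is what keeps this kind of load cheap as the source grows.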

Data Modeling & Semantic Layer

Design and implement the Engineering data model (dimensional models, fact/dimension tables, and domain-specific data marts) serving analytics, BI, ML and AI use cases
Build a governed, reusable semantic layer on top of the Gold layer, enabling self-service analytics through Power BI and GCP-connected consumers
Ensure data models are documented, versioned, and aligned to business domains within the VECE COE

Orchestration and Data Ops

Build and manage Databricks Workflows with multi-task dependencies, SLA monitoring, retry logic, and alerting
Implement CI/CD pipelines for Databricks using Azure DevOps and GitHub Actions, including Python Wheel packaging for reusable utility libraries deployed across the platform
Apply software engineering best practices: version control, unit testing, modular code design, and automated deployment to Dev/QA/Prod environments
Optimize cost and performance: cluster right-sizing, DBU management, Delta table optimization (VACUUM, compaction), and cost monitoring across Azure Databricks and GCP
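
As a rough illustration of the retry logic mentioned in the Workflows bullet above, here is a small generic sketch of retrying a task with exponential backoff. It is not Databricks code (Workflows configures retries per task in the job definition); `run_with_retries`, `flaky_task`, and the parameter names are hypothetical, and the sleep function is injectable so the example runs without waiting.

```python
import time

def run_with_retries(task, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call task(); on failure wait base_delay * 2**attempt, then try again."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the error so alerting can fire
            sleep(base_delay * 2 ** attempt)
            attempt += 1

attempts = {"n": 0}
delays = []  # records the requested waits instead of actually sleeping

def flaky_task():
    """Hypothetical task that fails twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky_task, sleep=delays.append)
```

Re-raising once the retry budget is spent matters as much as the retries themselves, since that is the signal monitoring and alerting would act on.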

Data Governance & Quality

Implement and manage Unity Catalog for centralized data governance: three-level namespace (catalog → schema → table), fine-grained RBAC, data masking, and audit logging
Build data quality frameworks (rule-based validation, deduplication, reconciliation, and anomaly detection), ensuring data arrives fit for AI/ML consumption
Establish data lineage tracking across ingestion, transformation, and serving layers
Govern data delivery to GCP, ensuring secure, validated, schema-consistent outputs consumed by downstream data science and analytics teams
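
To make the data quality bullet above concrete, the sketch below shows one simple way rule-based validation and anomaly detection can look in plain Python. It is an assumption-laden illustration rather than the framework used on this platform: the rule names, the `id` and `amount` fields, and the z-score threshold of 3.0 are all hypothetical choices.

```python
from statistics import mean, pstdev

# Hypothetical rules: each maps a rule name to a predicate over one row.
RULES = {
    "id_not_null": lambda r: r.get("id") is not None,
    "amount_non_negative": lambda r: r.get("amount") is not None and r["amount"] >= 0,
}

def validate(rows, rules=RULES):
    """Split rows into valid rows and (row, failed_rule_names) rejects."""
    valid, rejected = [], []
    for row in rows:
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            rejected.append((row, failed))
        else:
            valid.append(row)
    return valid, rejected

def zscore_outliers(values, threshold=3.0):
    """Return values whose distance from the mean exceeds threshold std devs."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > threshold]

rows = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": -2.0},
]
valid, rejected = validate(rows)
outliers = zscore_outliers([10.0] * 20 + [1000.0])
```

Keeping the rejected rows together with the names of the rules they failed, instead of silently dropping them, is what makes later reconciliation possible.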

AI & Proactive Analytics Foundation

Design pipelines that are AI-ready from day one, supporting structured ML feature pipelines, embedding generation, and future Vector DB integrations
Build the data infrastructure that enables the shift from descriptive dashboards to proactive, predictive analytics
Collaborate with Data Scientists and Analytics Engineers to ensure the Gold layer supports model training, feature stores, and real-time inference pipelines

YOU MUST HAVE

Databricks: 4+ years hands-on with PySpark, Delta Lake, Workflows, and Unity Catalog.
Demonstrate expertise in data strategy, for example: Medallion Architecture, Domain Data Modeling and Functional Data Architecture.
Data Quality Frameworks (e.g. rule-based validation, anomaly detection)
Data Pipelines: incremental loading, CDC, CI/CD, Observability
Advanced Python/PySpark and Advanced SQL
Strongly preferred: DLT, Unity Catalog (UC), GCP, Azure, Kafka.
Databricks Certified Professional certification is highly valued
7+ years of overall data engineering experience
4+ years of hands-on Azure Databricks experience in production environments
Proven experience building platforms, not just maintaining them: greenfield builds, migrations, framework development
Experience with financial, engineering, enterprise, or industrial-scale datasets preferred
Demonstrated ability to own technical decisions end-to-end, from architecture to production deployment

#LI-Hybrid