Advanced Data Engineer

Honeywell · Industrial · Bengaluru, Karnataka, India

Honeywell's VECE COE is building an AI-Ready data platform to power advanced analytics and AI-driven decision-making. This Senior Data Engineer will architect and build end-to-end data pipelines using Azure Databricks, transforming raw data into analytics-ready assets for downstream AI/ML consumption. The role focuses on data ingestion, modeling, orchestration, and governance, with an emphasis on building and delivering pipelines rather than maintaining them.

What you'd actually do

  1. Implement end-to-end ingestion pipelines from heterogeneous sources (e.g., Snowflake, SQL Server, Excel, REST APIs, and unstructured files) into Azure Databricks following defined architecture patterns
  2. Build and maintain Bronze → Silver → Gold Medallion layers, applying transformation logic, business rules, and quality checks at each stage
  3. Implement incremental loading patterns (e.g., CDC, watermarking, Delta Lake MERGE/UPSERT) to ensure efficient, scalable, and reliable data delivery
  4. Develop pipelines for structured and unstructured data (e.g., documents, JSON, Parquet, Excel) supporting AI and ML consumption downstream
  5. Build and manage Databricks Workflows, including configuring task dependencies, retry policies, and failure alerting

Skills

Required

  • Databricks (2+ years hands-on): PySpark, Delta Lake, Workflows, Unity Catalog
  • Medallion Architecture
  • Domain Data Modeling
  • Functional Data Architecture
  • Data Quality Frameworks (rule-based validation, anomaly detection)
  • Data Pipelines: incremental loading, CDC, CI/CD, Observability
  • Advanced Python/PySpark
  • Advanced SQL
  • 4-6+ years of overall data engineering experience
  • 2+ years of hands-on Azure Databricks experience in production environments
  • Experience working within a defined architecture and contributing to its improvement
  • Comfortable working with multiple data source types: relational, file-based, and API

Nice to have

  • Delta Live Tables (DLT)
  • Unity Catalog (UC)
  • GCP
  • Azure
  • Kafka
  • Databricks Certified Professional

What the JD emphasized

  • AI-Ready data platform
  • AI and ML consumption downstream
  • AI-driven decision-making
  • build and deliver pipelines — not just maintain or support them

Other signals

  • power advanced analytics, predictive insights, and data science
  • transform raw, multi-source data into governed, high-quality, analytics-ready assets
  • transition from traditional descriptive analytics to proactive, AI-driven decision-making