We are seeking a Data Engineer with 2 to 4 years of experience to design, build, and maintain scalable data pipelines using Databricks and cloud‑based data platforms. The ideal candidate will have hands‑on experience with Databricks Lakehouse architecture, building reliable ETL/ELT pipelines, and enabling analytics and data science use cases across the organization.
Your role will also include overseeing, supervising, and reviewing tasks performed by team members to ensure effective execution of work; managing end-to-end processes and projects for both internal and external clients with responsibility for timely and accurate delivery; issuing clear instructions and guidance to team members on assigned tasks; and mentoring and guiding junior colleagues to support their skill development, professional growth, and overall success.
Responsibilities -
Design, develop, and maintain data pipelines using Databricks (PySpark / Spark SQL)
Implement and optimize ETL/ELT workflows using Databricks jobs, notebooks, and workflows
Build and manage Delta Lake tables, ensuring data reliability, performance, and ACID compliance
Develop and optimize data models for analytics, BI, and downstream consumption
Work with batch and streaming data processing using Spark Structured Streaming (where applicable)
Collaborate with data scientists, analysts, and product teams to deliver trusted datasets
Ensure data quality, validation, and monitoring across pipelines
Optimize Spark jobs for cost and performance (partitioning, caching, tuning)
Follow best practices for code versioning, documentation, and deployment
Support production workloads and assist with troubleshooting data issues
Requirements -
2–4 years of professional experience as a Data Engineer
Strong hands‑on experience with the Databricks platform
Proficiency in Python (PySpark) and Spark SQL
Solid experience with Delta Lake, including MERGE operations and time travel
Strong SQL skills for data transformation and analysis
Experience with cloud data storage (AWS S3 / Azure Data Lake / GCP Cloud Storage)
Understanding of data warehousing and lakehouse concepts
Experience with ETL orchestration tools (Databricks Workflows, Airflow, Azure Data Factory, etc.)
Familiarity with Git and version control practices
Good to Have -
- Experience with streaming technologies (Kafka, Event Hubs, Kinesis)
- Exposure to dbt, Unity Catalog, or Databricks governance features
- Knowledge of cloud security, IAM, and cost optimization
- Experience supporting BI tools (Power BI, Tableau, Looker)
- Understanding of data science or ML workflows on Databricks
- Experience working in Agile/Scrum teams