Data Engineer, Analytics Data Engineering

Dropbox · Enterprise · Canada +2 · CTO-Data Science, AI Platform & Eng (Sub Team)

Dropbox is hiring a Data Engineer for its Analytics Data Engineering team. This role involves building large, scalable analytics pipelines from scratch using modern Big Data technologies, focusing on data modeling, Spark jobs, data integrations, and data quality frameworks. The engineer will collaborate with business units and other engineering teams to define data platform architecture and conceptualize data architecture for large-scale projects. Responsibilities include designing, building, and optimizing data models, visualizations, and pipelines to support various use cases. The role requires significant experience with Spark, Python/Java/Scala, SQL, schema design, dimensional data modeling, medallion architectures, and the Databricks platform. Experience with orchestration frameworks like Airflow and data quality monitoring tools is preferred. The role is part of a team that may have on-call rotations.

What you'd actually do

  1. Help define company data assets (data models) and the Spark and SparkSQL jobs that populate them
  2. Help define and design data integrations and data quality frameworks, and design and evaluate open source/vendor tools for data lineage
  3. Work closely with Dropbox business units and engineering teams to develop a strategy for a long-term Data Platform architecture that is efficient, reliable, and scalable
  4. Conceptualize and own the data architecture for multiple large-scale projects, while evaluating design and operational cost-benefit tradeoffs within systems
  5. Collaborate with engineers, product managers, and data scientists to understand data needs, representing key data insights in a meaningful way

Skills

Required

  • Spark
  • Python
  • Java
  • C++
  • Scala
  • SQL
  • schema design
  • dimensional data modeling
  • medallion architectures
  • Databricks platform
  • data lake architectures
  • product strategic thinking
  • communications
  • data processing systems

Nice to have

  • Airflow
  • data quality monitoring
  • Monte Carlo

What the JD emphasized

  • building from the ground up
  • building new things without being constrained by technical debt