Staff Data Engineer

Dropbox · Enterprise · Mexico · CTO-Data Science, AI Platform & Eng (Sub Team)

This Staff Data Engineer role sits on the Analytics Data Engineering (ADE) team within Data Science & AI Platform. The role is responsible for solving cross-cutting data challenges, driving standardization in analytics pipelines, modernizing the analytics platform, and laying the foundation for AI-native data development. It partners closely with the Data Science, Data Infrastructure, Product Engineering, and Business Intelligence teams. Key responsibilities include designing and implementing shared data models, standardizing data engineering practices, architecting shift-left data governance, and evaluating and integrating AI-native tooling.

What you'd actually do

  1. Lead the design and implementation of shared, reusable data models: common fact tables, conformed dimensions, and a semantic/metrics layer that serves as the single source of truth across analytics functions
  2. Drive standardization of data engineering practices across ADE and functional analytics teams, including pipeline patterns, CI/CD workflows, naming conventions, and data modeling standards
  3. Partner with Data Infrastructure to modernize orchestration, improve pipeline decomposition, and establish secure dev/test environments with production data access
  4. Architect and implement a shift-left data governance strategy, working with upstream data producers to establish data contracts, SLOs, and code-enforced quality gates that catch issues before production
  5. Collaborate with Data Science leads and Product Management to translate metric definitions into reliable, certified data pipelines that power executive dashboards, weekly business review (WBR) reporting, and growth measurement

Skills

Required

  • BS degree in Computer Science or related technical field, or equivalent technical experience
  • 12+ years of experience in data engineering or analytics engineering with increasing scope and technical leadership
  • 12+ years of SQL experience, including complex analytical queries, window functions, and performance optimization at scale (Spark SQL)
  • 8+ years of Python development experience, including building and maintaining production data pipelines
  • Deep expertise in dimensional data modeling, schema design, and scalable data architecture, with hands-on experience building shared data models across multiple business domains
  • Strong experience with orchestration tools (Airflow strongly preferred) and dbt, including pipeline design, scheduling strategies, and failure recovery patterns
  • Demonstrated ability to drive cross-team technical alignment, establishing standards, influencing without authority, and working across Data Engineering, Data Science, Data Infrastructure, and Product Engineering boundaries

Nice to have

  • Experience with Databricks (Unity Catalog, Delta Lake) and modern lakehouse architectures
  • Experience leading orchestration or platform modernization efforts at scale
  • Familiarity with data governance and observability tools such as Atlan, Monte Carlo, Great Expectations, or similar
  • Experience building or contributing to a metrics/semantic layer (dbt MetricFlow, Databricks Metric Views, or equivalent)
  • Track record of establishing data engineering standards and best practices in a federated analytics organization

What the JD emphasized

  • 12+ years of experience in data engineering or analytics engineering with increasing scope and technical leadership
  • 12+ years of SQL experience, including complex analytical queries, window functions, and performance optimization at scale (Spark SQL)
  • 8+ years of Python development experience, including building and maintaining production data pipelines
  • Deep expertise in dimensional data modeling, schema design, and scalable data architecture, with hands-on experience building shared data models across multiple business domains
  • Strong experience with orchestration tools (Airflow strongly preferred) and dbt, including pipeline design, scheduling strategies, and failure recovery patterns
  • Demonstrated ability to drive cross-team technical alignment, establishing standards, influencing without authority, and working across Data Engineering, Data Science, Data Infrastructure, and Product Engineering boundaries

Other signals

  • Modernizing our analytics platform
  • Building shared and reusable data models
  • Establishing a certified metrics framework
  • Laying the foundation for AI-native data development
  • Evaluating and integrating AI-native tooling into the data development lifecycle