Lead Data Engineer

Visa · Fintech · Bellevue, WA

Lead Data Engineer responsible for the architecture, development, and optimization of large-scale data platforms and cloud-based analytics environments at Visa. The role combines architectural direction, leadership of complex engineering initiatives, team mentorship, and hands-on work with modern data technologies. It drives technical best practices, ensures platform scalability, and shapes data engineering strategy for key products and business domains, with a focus on data governance, quality, observability, and cloud resource optimization.

What you'd actually do

  1. Lead the architecture and delivery of large-scale, high-performance data pipelines and processing frameworks across Hadoop and multi-cloud environments.
  2. Design scalable data models, lakehouse structures, and distributed data processing solutions that support analytics, machine learning, and real-time data needs.
  3. Provide technical leadership to Senior and Staff Data Engineers, conducting design reviews, guiding implementation decisions, and ensuring engineering excellence.
  4. Partner with cross-functional teams to translate business and product requirements into robust technical designs and data solutions.
  5. Develop and improve engineering best practices for data governance, quality, observability, testing, and cloud resource optimization.
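The governance, quality, and observability practices in item 5 can be made concrete with a minimal column-level null-rate check, the kind of metric a pipeline might emit before publishing a dataset. This is an illustrative sketch in plain Python (the `null_rate_report` name and threshold are invented for the example, not part of any specific framework):

```python
from typing import Any

def null_rate_report(rows: list[dict[str, Any]], threshold: float = 0.1) -> dict[str, float]:
    """Return columns whose fraction of missing (None) values exceeds `threshold`.

    A tiny data-quality check: compute per-column null rates and flag
    columns that breach the allowed threshold, as an observability metric.
    """
    if not rows:
        return {}
    counts: dict[str, int] = {}
    for row in rows:
        for col, value in row.items():
            if value is None:
                counts[col] = counts.get(col, 0) + 1
    total = len(rows)
    return {col: n / total for col, n in counts.items() if n / total > threshold}

# Example: 'email' is missing in half the rows, so it gets flagged.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": None},
]
print(null_rate_report(rows, threshold=0.25))  # {'email': 0.5}
```

In production this kind of check would typically run inside the pipeline framework (e.g. as a Spark aggregation) and feed dashboards or alerts rather than a `print`.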

Skills

Required

  • Advanced expertise in building and optimizing large-scale distributed data systems using Hadoop, Spark, and modern lakehouse architectures.
  • Strong programming proficiency in PySpark, Scala, and Python with experience implementing scalable, production-grade data applications.
  • Deep experience designing and tuning RDBMS, NoSQL, and distributed SQL systems.
  • Mastery of SQL and distributed query engines such as Presto, Trino, Hive, and SparkSQL.
  • Strong knowledge of data modeling, ETL/ELT design, and data warehousing methodologies.
  • Proven experience architecting and operating data solutions on AWS, GCP, and Azure, including cloud data lakes, orchestration tools, and cost-effective storage/compute designs.
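One concrete convention behind the cloud data-lake and cost-effective-storage bullets above is Hive-style date partitioning of object-store paths, which lets engines such as Spark, Hive, Trino, and Presto prune irrelevant files at query time. A minimal sketch (the bucket name and the `build_partition_path` helper are hypothetical):

```python
from datetime import date

def build_partition_path(base: str, table: str, dt: date) -> str:
    """Build a Hive-style partitioned path (year=/month=/day=) under a lake prefix.

    Encoding partition columns in the path lets distributed SQL engines
    skip whole directories (partition pruning) instead of scanning them.
    """
    return (
        f"{base}/{table}/"
        f"year={dt.year:04d}/month={dt.month:02d}/day={dt.day:02d}/"
    )

path = build_partition_path("s3://example-lake/raw", "transactions", date(2024, 3, 7))
print(path)  # s3://example-lake/raw/transactions/year=2024/month=03/day=07/
```

The zero-padded month/day keep lexicographic and chronological order aligned, which matters for range listings over object storage.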

Nice to have

  • Advanced proficiency in Databricks, including:
      • Building and optimizing notebooks and production jobs
      • Delta Lake design and optimization
      • Cluster configuration and workspace administration
      • CI/CD integration for data workloads
      • Performance tuning for large distributed jobs
  • Demonstrated ability to lead technical initiatives, communicate architectural decisions, and influence engineering direction across multiple teams.
  • Strong problem-solving skills with the ability to troubleshoot complex data and performance issues.
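The SQL mastery and troubleshooting skills listed above meet in one everyday pattern: deduplicating to the latest record per key with a window function. The same `ROW_NUMBER()` idiom works in Presto/Trino, Hive, and SparkSQL; it is sketched here against SQLite purely so it runs anywhere (table and column names are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, status TEXT, updated_at TEXT);
    INSERT INTO events VALUES
        (1, 'pending',  '2024-03-01'),
        (1, 'active',   '2024-03-05'),
        (2, 'active',   '2024-03-02'),
        (2, 'disabled', '2024-03-04');
""")

# Keep only the most recent row per user_id -- the classic
# "latest record" dedup step in ETL/ELT merges.
latest = conn.execute("""
    SELECT user_id, status FROM (
        SELECT user_id, status,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id ORDER BY updated_at DESC
               ) AS rn
        FROM events
    )
    WHERE rn = 1
    ORDER BY user_id
""").fetchall()

print(latest)  # [(1, 'active'), (2, 'disabled')]
```

On a distributed engine the `PARTITION BY` key also drives the shuffle, so skew in that key is a common performance-troubleshooting target.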

What the JD emphasized

  • large-scale distributed data systems
  • Hadoop
  • Spark
  • modern lakehouse architectures
  • PySpark
  • Scala
  • Python
  • RDBMS
  • NoSQL
  • distributed SQL systems
  • SQL
  • Presto
  • Trino
  • Hive
  • SparkSQL
  • data modeling
  • ETL/ELT design
  • data warehousing methodologies
  • AWS
  • GCP
  • Azure
  • cloud data lakes
  • orchestration tools
  • Databricks
  • Delta Lake
  • CI/CD integration for data workloads
  • performance tuning