Senior Software Engineer - Web Data Team

ZoomInfo · Enterprise · Vancouver, BC · Engineering - Data Engineering

We're looking for a Senior Software Engineer to join the Web Data team, focused on building and operating large-scale web crawling and data extraction infrastructure. The role involves designing and implementing scalable, fault-tolerant pipelines, writing production code in Java and Python, working with cloud infrastructure (GCP/AWS, GKE), and improving system observability and reliability. It emphasizes strong software engineering fundamentals along with experience in data engineering and cloud technologies.

What you'd actually do

  1. Design and implement components of scalable, fault-tolerant web crawling and extraction pipelines
  2. Write clean, production-grade code in Java and Python
  3. Build and operate ETL/ELT pipelines for large-scale data extraction and transformation
  4. Work with cloud infrastructure on GCP and AWS, primarily on GKE
  5. Improve observability, reliability, and operational excellence across the systems you contribute to

Skills

Required

  • 5+ years of professional software engineering experience building production systems
  • Strong CS fundamentals: algorithms, data structures, concurrency, distributed systems
  • Proficiency in Java and/or Python
  • Track record of owning features end-to-end from design through deployment and operation
  • Comfortable making sound architectural decisions at the component level
  • Hands-on experience with cloud data warehouses such as BigQuery or Snowflake
  • Experience designing and operating large-scale ETL/ELT pipelines
  • Experience with orchestration tools such as Apache Airflow
  • Experience with streaming or event-driven systems such as Apache Kafka
  • Production experience on GCP (preferred) or AWS; multi-cloud exposure is a plus
  • Hands-on experience with Kubernetes (GKE/EKS) for distributed workloads
  • Familiarity with infrastructure-as-code tooling such as Terraform
  • Strong communicator who can explain technical decisions clearly
  • Comfortable operating in ambiguity and iterating quickly
  • Bias toward action and pragmatic problem solving
  • Self-starter who thrives in fast-paced, evolving environments

Nice to have

  • Experience with web crawling at scale (Scrapy or similar frameworks)
  • Familiarity with proxy infrastructure, rotation strategies, or anti-bot evasion techniques
  • Experience in extracting structured and unstructured web data from diverse site architectures
  • Knowledge of SERP (Search Engine Results Page) extraction
  • Comfort with AI/LLM-based extraction approaches, applying language models to HTML at scale
  • Experience working in a B2B data company or data-as-a-product environment