Senior Data Engineer

Unity · Enterprise · Mountain View, CA · Engineering

Unity is looking for a Senior Data Engineer to architect and implement distributed data systems for a near real-time reporting platform. The role involves designing and building high-throughput, low-latency data processing pipelines with technologies such as Apache Flink, Spark, and Airflow, with a focus on correctness, reliability, and scalability in production.

What you'd actually do

  1. Design and implement near real-time data pipelines and reporting infrastructure.
  2. Architect distributed stream and batch processing systems using technologies such as Apache Flink, Spark, and Airflow.
  3. Build and maintain data processing frameworks that handle large-scale event ingestion and transformation with strong correctness guarantees.
  4. Ensure production-grade reliability, observability, and operability across distributed systems.
  5. Define and enforce data processing semantics, including exactly-once processing, event-time vs. processing-time handling, stateful stream management, and backpressure and fault-tolerance strategies.

Skills

Required

  • Strong foundation in distributed systems and systems design.
  • Hands-on experience building and operating large-scale data processing systems.
  • Deep understanding of streaming concepts: exactly-once semantics, watermarking and event-time processing, stateful stream processing, checkpointing and recovery, and backpressure handling.
  • Production experience with frameworks such as Apache Flink, Spark, Kafka, or similar technologies.
  • Proficiency in Python, Java, or Scala.
  • Experience with workflow orchestration tools (e.g., Airflow) for stream and batch coordination.
  • Strong understanding of cloud-native architectures and distributed infrastructure (Kubernetes, containerization, cloud platforms).

Nice to have

  • Ability to lead architecture across teams and influence platform strategy.
  • Experience defining technical roadmaps for distributed data systems.
  • Track record of driving reliability improvements and scaling systems for high-throughput environments.
  • Mentorship and technical leadership across engineering teams.

What the JD emphasized

  • high-throughput
  • low-latency
  • correctness
  • reliability
  • scalability
  • production ownership
  • Exactly-once processing
  • Event time vs. processing time handling
  • Stateful stream management
  • Backpressure and fault tolerance strategies