Data Engineer

Lyft · Consumer · Toronto, ON · Mapping

Data Engineer on the Mapping team responsible for architecting, building, and maintaining scalable data pipelines and services to support route simulation, experimentation, analytics, and machine learning models. The role involves working with AWS, Kubernetes, and Apache Airflow, and requires strong experience with Spark, Python, and various database technologies.

What you'd actually do

  1. Own the core data pipelines in Mapping, scaling data processing to keep pace with rapid data growth at Lyft
  2. Develop strong subject matter expertise in the systems you manage, setting and managing SLAs for both data pipelines and datasets
  3. Continuously evolve data models and schemas to meet business and engineering requirements
  4. Develop tools that support self-service management of data pipelines (ETL) and schema evolution, and perform SQL tuning to optimize data processing performance
  5. Write clean, well-tested, and maintainable code, prioritizing scalability and cost efficiency
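The pipeline-ownership duties above amount to running dependency-ordered ETL steps, which an orchestrator like Airflow manages at scale. A minimal, library-free sketch of that idea, using Python's standard-library `graphlib` — the task names and toy data here are illustrative, not part of the posting:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical ETL steps for a mapping pipeline; in production an
# orchestrator such as Airflow would own scheduling, retries, and SLAs.
def extract_routes():
    return [{"route_id": 1, "distance_km": 12.5}]

def transform_routes(rows):
    # Derive miles from kilometres for each route row.
    return [{**r, "distance_mi": round(r["distance_km"] * 0.621371, 2)} for r in rows]

def load_routes(rows):
    # Stand-in for a warehouse write (e.g. to Hive or Iceberg).
    return len(rows)

# Express task dependencies as a DAG: transform depends on extract,
# load depends on transform.
dag = {"transform": {"extract"}, "load": {"transform"}}
order = list(TopologicalSorter(dag).static_order())

results = {}
for task in order:
    if task == "extract":
        results[task] = extract_routes()
    elif task == "transform":
        results[task] = transform_routes(results["extract"])
    elif task == "load":
        results[task] = load_routes(results["transform"])
```

The topological sort guarantees each step runs only after its upstream dependencies, which is the core contract an Airflow DAG enforces.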

Skills

Required

  • Spark
  • Python
  • SQL
  • Data Pipelines
  • ETL
  • Data Modeling
  • Cloud (AWS)
  • Kubernetes
  • Airflow
  • Database, querying, and streaming technologies (S3, DynamoDB, HDFS, Hive, Presto, Pig, HBase, Parquet, Iceberg, Flink, Spark Streaming, Kafka)
  • Data quality tools (Great Expectations, dbt, Monte Carlo, Soda, Collibra)
  • Geospatial data querying
  • Performance tuning
  • Workflow management tools (Airflow, Oozie, Azkaban, UC4, Prefect)
  • Infrastructure tooling (Terraform, CloudFormation, Docker, Kubernetes, Ansible, Chef, Puppet)
  • API schema definition
  • Backend services development
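Several of the required skills (data quality tooling, data modeling) reduce to asserting expectations against datasets. A minimal sketch of the kind of check that tools like Great Expectations formalize — the column names and rules here are hypothetical:

```python
# Hypothetical row-level quality checks in the spirit of Great Expectations;
# real tools add expectation suites, profiling, and reporting on top.
def check_not_null(rows, column):
    """Every row must have a non-null value in the given column."""
    return all(r.get(column) is not None for r in rows)

def check_in_range(rows, column, lo, hi):
    """Every value in the column must fall within [lo, hi]."""
    return all(lo <= r[column] <= hi for r in rows)

trips = [
    {"trip_id": "a1", "duration_min": 14},
    {"trip_id": "a2", "duration_min": 52},
]

report = {
    "trip_id_not_null": check_not_null(trips, "trip_id"),
    "duration_in_range": check_in_range(trips, "duration_min", 0, 240),
}
```

In practice such checks run as a pipeline step, failing the run (or alerting) when an expectation is violated rather than silently loading bad data.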

Nice to have

  • Ruby
  • Bash
  • MySQL
  • PostgreSQL
  • SQL Server
  • Oracle

What the JD emphasized

  • 4+ years of relevant professional experience
  • Strong experience with Spark
  • Experience with disparate database, querying, and streaming technologies such as S3, DynamoDB, HDFS, Hive, Presto, Pig, HBase, Parquet, Iceberg, Flink, Spark Streaming, Kafka
  • Experience with data quality tools such as Great Expectations, dbt, Monte Carlo, Soda, Collibra
  • Strong understanding of SQL engines, experience querying geospatial data, and the ability to conduct advanced performance tuning
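The geospatial-querying emphasis typically means radius and containment filters over lat/lon data, which SQL engines expose through functions like Presto's `ST_Distance`. A standard-library sketch of the underlying computation — the pickup points and reference location are made up for illustration:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical pickup points around Toronto; keep those within 5 km of a
# reference point, as a geospatial SQL engine would with a distance predicate.
reference = (43.6453, -79.3806)  # near downtown Toronto
pickups = [
    {"id": "p1", "lat": 43.6532, "lon": -79.3832},  # downtown, ~1 km away
    {"id": "p2", "lat": 43.7615, "lon": -79.4111},  # North York, ~13 km away
]
nearby = [p for p in pickups
          if haversine_km(reference[0], reference[1], p["lat"], p["lon"]) < 5.0]
```

At warehouse scale the same filter would be pushed into the engine (with a spatial index or geohash partitioning) rather than computed row by row in Python.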