Software Engineer

Uber · Consumer · San Francisco, CA · Engineering

Software Engineer at Uber responsible for designing and developing large-scale data systems, including databases, data warehouses, and big data platforms. The role involves building robust, scalable software and designing data pipelines for batch and real-time processing with tools such as Apache Airflow, Spark, SQL, Kafka, and Flink. Key responsibilities include ensuring data quality, privacy, and compliance; championing data governance; driving automation; and collaborating with cross-functional teams. The role also calls for staying current on data engineering, Gen AI, and cloud solutions to create innovative solutions, and for mentoring junior engineers.

What you'd actually do

  1. Design and develop large-scale data systems, including databases, data warehouses, and big data platforms.
  2. Build robust and scalable software solutions using modern software engineering practices.
  3. Design and build scalable data pipelines for both batch and real-time processing: leverage Apache Airflow, Spark, and SQL for ETL workflows across diverse data sources (e.g., relational databases, APIs, logs), and use tools like Apache Kafka, Flink, and Spark Structured Streaming to enable near real-time data processing for analytics and monitoring use cases.
  4. Collaborate with stakeholders to understand business needs and translate them into scalable and reliable data systems and tools, while ensuring data quality, privacy, and compliance.
  5. Champion and enforce data governance practices, including data lineage, metadata management, data quality controls, and privacy regulations.
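As a rough illustration of the batch-ETL responsibility above (extract from a relational source, apply a data-quality filter, load a summary table), here is a minimal sketch in Python. The table and column names (`raw_trips`, `trip_summary`, `user_id`) are hypothetical, and the stdlib `sqlite3` module stands in for a production database or warehouse:

```python
import sqlite3

def run_batch_etl(conn):
    """Minimal batch ETL: extract raw events, drop rows failing a
    quality check (missing user_id), and load a per-user summary table."""
    cur = conn.cursor()
    cur.execute("""
        CREATE TABLE IF NOT EXISTS trip_summary AS
        SELECT user_id, COUNT(*) AS trips
        FROM raw_trips
        WHERE user_id IS NOT NULL   -- data-quality control: reject incomplete rows
        GROUP BY user_id
    """)
    conn.commit()
    return cur.execute(
        "SELECT user_id, trips FROM trip_summary ORDER BY user_id"
    ).fetchall()

# In-memory source standing in for an upstream relational data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_trips (user_id TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO raw_trips VALUES (?, ?)",
    [("u1", "SF"), ("u1", "NYC"), ("u2", "SF"), (None, "LA")],
)
print(run_batch_etl(conn))  # [('u1', 2), ('u2', 1)]
```

In practice a job like this would be one task in an Airflow DAG and would run on Spark rather than SQLite, but the extract-filter-aggregate-load shape is the same.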

Skills

Required

  • Python
  • Scala
  • SQL
  • ETL Data Pipelines
  • Amazon S3
  • Amazon EMR
  • AWS Lambda
  • Amazon Redshift
  • GCP BigQuery
  • HDFS
  • Hive
  • Presto
  • Spark
  • Flink
  • Airflow
  • MapReduce
  • Apache Kafka
  • Apache Hudi
  • Batch and real-time data processing
  • GenAI-powered applications
  • Data structures and algorithms
  • Designing technology stacks
  • Debugging and monitoring production services
  • Distributed systems
  • Software development lifecycle

Nice to have

  • Modern software engineering practices
  • Data lineage
  • Metadata management
  • Data quality controls
  • Privacy regulations
  • Automation initiatives
  • Building scripts, utilities, and frameworks
  • Collaborating with cross-functional teams
  • Mentoring junior engineers
  • Staying current with industry trends in data engineering, Gen AI, and cloud solutions
  • Creating innovative solutions for complex challenges
  • Writing comprehensive documentation
  • Actively sharing knowledge

What the JD emphasized

  • GenAI-powered applications for predictive and generative analytics, code optimization, and developer productivity

Other signals

  • design and develop large-scale data systems
  • build robust and scalable software solutions
  • design and build scalable data pipelines
  • leverage Apache Airflow, Spark, and SQL for ETL workflows
  • enable near real-time data processing
  • ensure data quality, privacy, and compliance
  • champion and enforce data governance practices
  • drive automation initiatives
  • collaborate with cross-functional teams
  • mentor junior engineers
  • stay current with the latest industry trends and technologies in data engineering, Gen AI, and cloud solutions
  • create innovative solutions for complex challenges
  • maintain comprehensive documentation
  • actively share knowledge with team