Senior Software Engineer, Managed Spark… at Google

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full-stack as we continue to push technology forward.

Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.

Individual pay is determined by factors including job-related skills, experience, and relevant education or training.

US: $174,000 - $253,000 (USD) + 15% bonus target + bonus + equity + benefits

Responsibilities

Build customer-facing features for Managed Service for Apache Spark (formerly Dataproc) to run Spark in the cloud.
Drive technical design and execution for performance and lakehouse features and enhancements.
Enhance Apache Spark and lakehouse technologies like Iceberg or Delta Lake for performance, reliability, security, and monitoring.
Contribute to documentation or educational content based on product updates and user feedback, and extend open-source technologies like Apache Spark, Flink, Hive, and Trino to improve debuggability, observability, and supportability.
Review code developed by other developers and provide feedback that upholds best practices, including style guidelines, code check-in, accuracy, testability, and efficiency.

Qualifications

Minimum qualifications:

Bachelor’s degree or equivalent practical experience.
5 years of experience designing, analyzing and troubleshooting large-scale distributed systems.
5 years of programming experience in Java, C++ or Golang.
Experience developing with Spark, Hive, or similar engines.
Experience in benchmarking and building custom benchmarks.
Experience in developing cloud or software as a service (SaaS) products.

Preferred qualifications:

Master's degree or PhD in Computer Science or a related technical field.
Experience with data lake technologies such as Apache Iceberg, Apache Hudi, and Delta Lake.
Experience with database optimizations, including query and executor optimizations.
Experience working with data science tools such as Jupyter notebooks.
Experience with OpenTelemetry, JMX, and other monitoring solutions.
Contributions to Apache or similar open-source projects such as Iceberg, Delta Lake, Hudi, Spark, Presto, or Flink.