Software Engineer, Distributed Systems, Cluster Management, Autopilot

Google · Big Tech · Warsaw, Poland +1

Software Engineer role focused on designing, developing, and maintaining the Autopilot system for Google Cloud, which leverages AI to optimize memory and compute resource allocation across applications. The role involves leading the full lifecycle of new features, addressing large-scale engineering challenges for system efficiency and reliability, and collaborating with other engineering groups. Requires experience in distributed systems, C++, and data structures/algorithms; experience applying AI and knowledge of operating systems internals are preferred.

What you'd actually do

  1. Design, develop, and maintain clean, reliable code to enhance the Autopilot system, leveraging AI to optimize how memory and compute resources are allocated across applications.
  2. Lead the end-to-end lifecycle of new features, from initial planning and design to launch, ensuring they operate safely and effectively in production environments.
  3. Address large-scale engineering challenges by identifying ways to increase system efficiency, enabling infrastructure to manage millions of tasks concurrently without performance degradation.
  4. Maintain system reliability by investigating software issues, debugging code, and actively monitoring operations to ensure Autopilot runs smoothly for all dependent applications.
  5. Collaborate closely with teammates and partner engineering groups to understand requirements, conduct code reviews, and share innovative ideas for building better infrastructure tools.

Skills

Required

  • C++
  • data structures
  • algorithm design
  • large-scale distributed computing systems
  • software development

Nice to have

  • Master's degree or PhD in Computer Science, Computer Engineering, or a related technical field
  • applying artificial intelligence to create software
  • end-to-end ownership of complex technical projects
  • data analysis
  • SQL
  • Google-internal development tools
  • operating systems internals
  • resource management
  • system performance optimization

What the JD emphasized

  • 5 years of experience in software development using general-purpose programming languages, with a focus on C++
  • Experience designing, building, and maintaining large-scale distributed computing systems
  • Experience applying artificial intelligence to create software

Other signals

  • AI to optimize memory and compute resources
  • large-scale distributed computing systems
  • system efficiency
  • system reliability