Senior Software Engineer, Acceleration Platform

Google · Big Tech · Singapore

Senior Software Engineer role architecting next-generation AI-native agentic systems that eliminate developer toil. The focus is on defining paradigms for self-sustaining AI that resolves engineering issues, establishing AI engineering excellence and security-by-design, and scaling technical capabilities. Responsibilities include leading the design of multi-agent networks, defining best practices that blend distributed systems with LLM orchestration and RAG, establishing an AI quality and safety strategy backed by automated evaluation, managing edge cases with advanced telemetry, and driving technical alignment and mentorship.

What you'd actually do

  1. Lead the design and architecture of highly scalable, fault-tolerant systems where multi-agent networks reason, plan, and execute complex workflows across vast, distributed codebases.
  2. Define best practices for the team and broader organization. Blend traditional distributed systems architecture with advanced LLM orchestration, complex Retrieval Augmented Generation (RAG) pipelines, and optimization.
  3. Establish the overarching technical strategy for AI quality and safety. Build automated evaluation frameworks that measure performance, enforce strict security standards, and reliably mitigate risks at scale.
  4. Manage the most intricate non-deterministic edge cases. Build advanced telemetry and introspection tooling that allows the entire organization to understand, debug, and optimize self-sustaining behavior.
  5. Drive technical alignment across local pods and global organizations. Mentor junior and mid-level engineers, translate extreme ambiguity into actionable technical roadmaps, and shape the future of AI-driven developer productivity.

Skills

Required

  • software programming in Python or C++
  • testing, maintaining, or launching software products
  • software design and architecture
  • core ML domain (generative AI, NLP, computer vision, speech/audio, reinforcement learning, recommendation systems, or ML infrastructure)
  • ML infrastructure (model training, model inference, model deployment, model evaluation, optimization, data processing, debugging)

Nice to have

  • data structures and algorithms
  • technical leadership role
  • distributed systems architecture
  • LLM capabilities, limitations, and failure modes
  • deploying and scaling enterprise-grade LLM-backed applications, RAG systems, and agentic systems
  • AI safety
  • enterprise security
  • advanced prompt engineering
  • scalable model evaluation methodologies
  • driving technical consensus and engineering excellence for complex, high-ambiguity 0-to-1 initiatives across multiple teams

What the JD emphasized

  • AI-native agentic systems
  • self-sustaining AI
  • AI quality and safety
  • automated evaluation frameworks
  • non-deterministic edge cases
  • 0-to-1 initiatives

Other signals

  • eliminate systemic developer toil
  • self-sustaining AI resolves complex engineering issues
  • intricate non-deterministic edge cases
  • advanced telemetry and introspection tooling
  • AI-driven developer productivity