Forward Deployed Architect, Generative AI, Google Cloud

Google · Big Tech · Beijing, China +2

The Generative AI Forward Deployed Architect will build and deploy reference agentic solutions for enterprise customers on Google Cloud. This role involves integrating AI models with existing infrastructure, addressing data readiness and state management issues, and building evaluation pipelines for accuracy, safety, and latency. The architect will also act as a feedback loop to inform Google Cloud's product roadmap.

What you'd actually do

  1. Develop complex reference solutions that enable customers to deploy Google’s latest and most advanced technologies.
  2. As part of an expert team, architect and develop reference prototypes that act as the connective tissue between Google’s advanced cloud solutions and the customer’s live infrastructure, including APIs, legacy data silos, and security perimeters.
  3. Build high-performance evaluation pipelines and observability frameworks to ensure agentic systems meet requirements for accuracy, safety, and latency.
  4. Identify repeatable field patterns and friction points within existing Google solutions, converting them into reusable modules or formal product feature requests for the Engineering teams.
  5. Collaborate with Solution Architect teams to instill Google-grade development best practices, ensuring long-term project success and high end-user adoption.

Skills

Required

  • Python
  • TypeScript
  • building pipelines for structured and unstructured data
  • vector databases
  • Retrieval-Augmented Generation (RAG)
  • technical discovery sessions
  • architecting technology solutions
  • data sovereignty
  • GDPR compliance
  • secure model governance

Nice to have

  • Master’s degree or PhD in Computer Science
  • architecting integrated systems
  • real-time inference constraints
  • model quantization
  • optimizing state management
  • granular tracing
  • model serving metrics
  • scaling production-grade ML systems
  • workflow pipelines
  • CI/CD/CT automation
  • experimentation
  • GenMedia models
  • fine-tuning capability
  • image generation
  • video generation
  • audio generation

What the JD emphasized

  • building and shipping production-grade solutions
  • incorporating vector databases and Retrieval-Augmented Generation (RAG)-like architectures
  • data sovereignty, GDPR compliance, and secure model governance
  • real-time inference constraints
  • model quantization
  • optimizing state management and granular tracing
  • content generation at scale
  • scaling production-grade ML systems
  • fine-tuning capability

Other signals

  • building reference agentic solutions
  • integrating with customer infrastructure
  • feedback loop to product roadmap
  • building evaluation pipelines
  • optimizing inference and model serving