Software Engineer III, Multimodal Agentic AI, XR

Google · Big Tech · San Jose, CA +1

This role focuses on designing, developing, and deploying multimodal agentic AI solutions for smart glasses, leveraging Gemini Live and Astra. It involves enhancing multimodal tools, defining strategies for data, evaluation, and post-tuning of Gemini models, and optimizing agent architecture for inference cost. The role emphasizes AI quality for production systems, including evaluation frameworks and model improvements, with a focus on multimodal conversational quality, tool use, and goal-oriented reasoning.

What you'd actually do

  1. Design, develop, and deploy scalable and agentic AI solutions for high-value, real-world multimodal conversational AI use cases on smart glasses.
  2. Gain an understanding of the Gemini Live and Astra tech stack and infrastructure. Optimize agent architecture/orchestration to ensure efficient deployment and operation at scale, with a focus on inference cost optimization.
  3. Take ownership of AI quality for production systems. This includes defining technical metrics, implementing evaluation frameworks, analyzing loss patterns, and driving improvements through data collection, smart data generation, and model enhancements.
  4. Implement, optimize, and advance AI techniques, with a focus on multimodal conversational quality, multimodal tool use, and multimodal goal-oriented reasoning.

Skills

Required

  • software development in Python or C++
  • ML infrastructure (e.g., model deployment, model evaluation, optimization, data processing, debugging)
  • GenAI techniques (e.g., LLMs, multimodal models, large vision models) or GenAI-related concepts (e.g., language modeling, computer vision)

Nice to have

  • data structures and algorithms
  • applied research to enable new functionality and improve the quality and efficiency of large language and multimodal models
  • machine learning and statistics

What the JD emphasized

  • scalable and agentic AI solutions
  • multimodal conversational AI use cases
  • agent architecture/orchestration
  • inference cost optimization
  • AI quality for production systems
  • evaluation frameworks
  • multimodal conversational quality
  • multimodal tool use
  • multimodal goal-oriented reasoning

Other signals

  • building agentic AI solutions
  • multimodal experience
  • Gemini Live and Astra
  • smart glasses
  • goal-oriented reasoning tasks
  • multimodal tools and extensions
  • data, evaluation, and post-tuning of the Gemini model
  • AI and XR convergence
  • augment human intelligence
  • personalized, conversational, and contextually aware experiences
  • real-world multimodal conversational AI use cases on smart glasses
  • model enhancements