Staff Software Engineer, Model Quality

Google · Big Tech · New York, NY +1

Staff Software Engineer for Google Pics, an AI-powered visual editor, focusing on building and improving automated evaluation systems for generative AI models. The role involves establishing metrics, running evaluations, providing insights for model quality improvement, and creating tools to enhance the evaluation process, with a roadmap towards a 2026 launch.

What you'd actually do

  1. Build both human-powered and Large Language Model (LLM)-powered automated evaluation systems to assess model performance.
  2. Establish clear metrics to measure aspects like grounding, coherence, safety, and helpfulness.
  3. Utilize platforms and tools to efficiently run evaluations across different models and datasets.
  4. Provide actionable insights from evaluations to improve model quality, often in collaboration with research and cross-functional teams.
  5. Create tools and systems that make the evaluation process more efficient and effective.

Skills

Required

  • software development
  • speech/audio
  • reinforcement learning
  • ML infrastructure
  • ML design
  • model deployment
  • model evaluation
  • data processing
  • debugging
  • fine-tuning
  • testing
  • launching software products
  • software design
  • architecture
  • integrating generative AI tools
  • Large Language Model (LLM) interfaces

Nice to have

  • Master’s degree or PhD in Engineering, Computer Science, or a related technical field
  • data structures and algorithms
  • technical leadership role leading project teams and setting technical direction
  • working in a complex, matrixed organization involving cross-functional or cross-business projects

What the JD emphasized

  • automated evaluation systems
  • model performance
  • grounding, coherence, safety, and helpfulness
  • evaluations
  • model quality
  • evaluation process

Other signals

  • building an AI-powered visual editor
  • executing against a 2026 roadmap
  • building on limited testing and consumer experiments
  • General Availability (GA) launch in 2026
  • fast pace against a comprehensive roadmap of model capabilities