Currently tracking 489 active AI roles, up 170% versus the prior 4 weeks. Primary focus: Agent · Engineering. Salary range $98k–$505k (avg $233k).
| Title | Stage | AI score |
|---|---|---|
| Senior Staff Research Engineer, DeepMind Senior Staff Research Engineer at Google DeepMind focused on Agent Evals and Quality for GenAI model improvement and product development. The role involves developing, evaluating, and optimizing LLM-based agents for complex, multi-step tasks. Responsibilities include constructing quantitative benchmarks and automated evaluation frameworks (e.g., LLM-as-a-judge) to measure agent capabilities in reasoning, planning, and tool use, as well as creating and optimizing data mixes from user feedback for training and fine-tuning agents. The role also requires analyzing agent behavior to identify failure modes and performance bottlenecks. | Eval GateAgent | 9 |
| Senior Quality Engineer, Gemini Enterprise Quality Senior Quality Engineer for Gemini Enterprise Quality at Google Cloud AI Research. This role involves designing and implementing ML solutions, leveraging ML infrastructure, and focusing on quality assurance for AI products, particularly in specialized ML areas like speech/audio or reinforcement learning. The role requires experience in ML infrastructure, including model deployment and evaluation, and contributes to bringing AI innovations to real-world impact. |
| Eval GateServe |
| 7 |
| Senior Staff Uber Technical Lead, Observability Intelligence Senior Staff Uber Technical Lead for Observability Intelligence, driving the strategic shift of SRE incident response to an AI-driven paradigm within Google Cloud's monitoring systems. This role involves leading large-scale ML infrastructure optimization, defining the Observability Intelligence strategy, representing the organization in technical reviews, and partnering with Product Management to translate product needs into scalable architectural solutions. The focus is on building a cohesive, AI-powered observability ecosystem. | Eval GateServe | 7 |
| Software Engineer III, Skills Evaluation, Chrome Software Engineer III role focused on building and maintaining evaluation pipelines, safety classifiers, and automated testing systems for AI skills within the Chrome product. This involves designing and implementing metrics, visualization tools, and auto-raters to ensure the quality, safety, and performance of AI workflows, with a focus on integrating with various AI models and browser surfaces. | Eval GatePost-train | 7 |
| Staff Software Engineer, Agentic Data and Evals Staff Software Engineer focused on building and launching tools and solutions for GenAI data generation and evaluations. The role involves developing a self-service data generation platform, performing LLM/GenAI model evaluations, and fine-tuning models using techniques like RLHF. The engineer will work cross-functionally to deliver high-quality data sets and evaluation infrastructure for various GenAI use cases. | Eval GatePost-train | 7 |
| Senior Data Scientist, Core Ranking and AI Context Senior Data Scientist role focused on Core Ranking and AI Context Engineering (CRAFT) for Google Search, AI Overview, and AI Mode products. The role involves identifying quality and metric headroom, conducting analyses, applying statistical/AI methods, developing and automating evals and measurements for iterative improvements, and partnering with engineering and product teams to drive system changes and launches. The position requires a Master's degree in a quantitative field and 5 years of experience in analytics and coding, with preferred experience in consumer-facing products and evaluation methodologies. | Eval GateShip | 7 |