Senior Staff Uber Technical Lead, Observability Intelligence

Google · Big Tech · New York, NY +1

The Senior Staff Uber Technical Lead for Observability Intelligence drives the strategic shift of SRE incident response to an AI-driven paradigm within Google Cloud's monitoring systems. The role involves leading large-scale ML infrastructure optimization, defining the Observability Intelligence strategy, representing the organization in technical reviews, and partnering with Product Management to translate product needs into scalable architectural solutions. The focus is on building a cohesive, AI-powered observability ecosystem.

What you'd actually do

  1. Drive technical project strategy, lead large-scale ML infrastructure optimization, and oversee the design and implementation of solutions across multiple specialized ML areas.
  2. Define and socialize a cohesive "Observability Intelligence" strategy that aligns with the broader Monitoring Northstar, ensuring that shared technical concerns are solved once for the entire organization.
  3. Represent the Observability Intelligence organization in high-stakes technical reviews and collaborate across organizational boundaries (AlertManager, AI Operations, Incident Response Management, and Site Reliability Engineering teams across all Product Areas) to drive consensus on critical observability standards.
  4. Act as the primary technical partner to Product Management, translating broad product "Whats" into scalable architectural "Hows."
  5. Lead high-level design reviews that ensure technical consistency across the stack, prioritizing interoperability, reusability, and semantic cohesion.

Skills

Required

  • Bachelor’s degree or equivalent practical experience.
  • 8 years of experience in software development.
  • 7 years of experience managing technical projects, ML design, and working with industry ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine tuning).
  • 5 years of experience with one or more of the following: Speech/audio (e.g., technology duplicating and responding to the human voice), reinforcement learning (e.g., sequential decision making), ML infrastructure, or specialization in another ML field.
  • 5 years of experience with design and architecture, and with testing and launching software products.

Nice to have

  • Master’s degree or PhD in Engineering, Computer Science, or a related technical field.
  • 8 years of experience with data structures and algorithms.
  • 5 years of experience in a technical leadership role leading project teams and setting technical direction.
  • 3 years of experience working in a complex, matrixed organization involving cross-functional, or cross-business projects.
  • Familiarity with and interest in the current AI landscape (Large Language Models (LLMs), generative agents, etc.).

What the JD emphasized

  • Uber Technical Lead (UTL) for Observability Intelligence
  • AI-driven paradigm
  • generational evolution of monitoring systems
  • cohesive, "Northstar" observability ecosystem
  • managing business-critical domains
  • architectural trade-offs between urgent product requirements and long-term technical durability
  • large-scale ML infrastructure optimization
  • model deployment
  • model evaluation
  • data processing
  • debugging
  • fine tuning

Other signals

  • AI-driven paradigm for incident response
  • generational evolution of monitoring systems
  • transformative shift to a cohesive observability ecosystem
  • strategic initiatives for Observability Intelligence