Research Scientist, Visual Data and Generative Research at Google

What you'd actually do

Design and execute high-throughput strategies to capture high-quality multi-view video and image data from thousands of unique participants and environments.

Design and optimize specialized acquisition hardware and optical configurations to extract high-precision ground truth visual data for complex foreground subjects and environmental backgrounds.

Research and implement methods to fine-tune generative video foundation models on proprietary datasets to produce high-fidelity synthetic video training data, dramatically increasing model exposure to scenes.

Develop automated pipelines that generate high-resolution depth, segmentation, and motion labels from production-grade models to supervise and train next-generation research architectures.

Create rigorous, large-scale image and video evaluation datasets specifically designed to measure and solve "long-tail" quality issues, such as complex material properties and temporal stability.

Skills

Required

Python
C++
visual data acquisition and curation for 3D vision tasks
designing and training neural networks
transformers
diffusion models

Nice to have

generative video research
fine-tuning or distillation of foundation vision models for novel view synthesis
distributed training frameworks
large-scale machine learning data infrastructure
computational photography
specialized sensor calibration
active illumination
multi-modal sensor fusion
managing end-to-end visual data pipelines

What the JD emphasized

PhD or equivalent practical experience in computer vision, machine learning, computer graphics, or generative media.

Experience designing and training neural networks, specifically with transformers or diffusion models.

Proven track record of managing end-to-end visual data pipelines, from initial capture strategy to automated curation and model integration.

As an organization, Google maintains a portfolio of research projects driven by fundamental research, new product innovation, product contribution and infrastructure goals, while providing individuals and teams the freedom to emphasize specific types of work. As a Research Scientist, you'll set up large-scale tests and deploy promising ideas quickly and broadly, managing deadlines and deliverables while applying the latest theories to develop new and improved products, processes, or technologies. From creating experiments and prototyping implementations to designing new architectures, our research scientists work on real-world problems that span the breadth of computer science, such as machine (and deep) learning, data mining, natural language processing, hardware and software performance analysis, improving compilers for mobile platforms, as well as core search and much more.

As a Research Scientist, you'll also actively contribute to the wider research community by sharing and publishing your findings, with ideas inspired by internal projects as well as from collaborations with research programs at partner universities and technical institutes all over the world.

Labs is a group focused on incubating early-stage efforts in support of Google’s mission to organize the world’s information and make it universally accessible and useful. Our team exists to help discover and create new ways to advance our core products through exploration and the application of new technologies. We work to build new solutions that have the potential to transform how users interact with Google. Our goal is to drive innovation by developing new Google products and capabilities that deliver significant impact over longer timeframes.

The US base salary range for this full-time position is $147,000-$211,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Qualifications

Minimum qualifications:

PhD or equivalent practical experience in computer vision, machine learning, computer graphics, or generative media.
Experience in Python and C++.
Experience with visual data acquisition and curation for 3D vision tasks, such as multi-view video, point clouds, or radiance fields.
Experience designing and training neural networks, specifically with transformers or diffusion models.

Preferred qualifications:

Experience with generative video research, including fine-tuning or distillation of foundation vision models for novel view synthesis.
Familiarity with distributed training frameworks and large-scale machine learning data infrastructure.
Expertise in computational photography or specialized sensor calibration, such as active illumination or multi-modal sensor fusion.
Proven track record of managing end-to-end visual data pipelines, from initial capture strategy to automated curation and model integration.
