Research Scientist, HCI-Multimodality - Interaction Perception, PICO

ByteDance · Big Tech · San Jose, CA · R&D

Research Scientist focused on developing computer vision, NLP, and LLM algorithms for next-generation VR intelligent interaction, including input methods, prediction, error correction, and multimodal fusion. The role involves delivering technical innovations, patents, and research translation, with an emphasis on lightweight models that can run on VR/edge devices.

What you'd actually do

  1. Develop computer vision-driven VR interaction algorithms centered on intelligent input method systems.
  2. Build LLM and NLP algorithms for prediction, error correction, and completion.
  3. Research lightweight LLM/NLP and vision-language multimodal fusion for VR.
  4. Deliver technical innovations, patents, and research translation.
  5. Provide technical leadership for the algorithm team.

Skills

Required

  • NLP
  • LLM
  • Transformers
  • Sequence Modeling
  • PyTorch
  • TensorFlow
  • Computer Vision
  • Multimodal Background

Nice to have

  • Intelligent Input Methods
  • Chinese Input Methods
  • LLM Fine-tuning
  • Lightweight Model Deployment
  • VR/Edge Devices
  • VR Interaction
  • Scenario Understanding

What the JD emphasized

  • 5+ years NLP/LLM R&D experience
  • 2+ years leading core algorithms
  • Expertise in Transformers, LLMs and sequence modeling
  • Basic computer vision or multimodal background

Other signals

  • LLM
  • NLP
  • Computer Vision
  • Multimodal Fusion
  • VR Interaction