2026 Applied Science Internship - United States, PhD Student Science Recruiting, Frontier AI & Robotics

Amazon · Big Tech · San Francisco, CA · Applied Science

This internship focuses on developing novel algorithms at the intersection of LLMs and generative AI for robotics. Research spans robotic perception, manipulation, and control, with an emphasis on multimodal models and vision-language-action systems.

What you'd actually do

  1. Develop novel, scalable algorithms and modeling techniques that advance the state-of-the-art in areas at the intersection of LLMs and generative AI for robotics
  2. Tackle challenging, groundbreaking research problems on production-scale data, with a focus on robotic perception, manipulation, and control
  3. Collaborate with cross-functional teams to solve complex business problems, leveraging your expertise in areas such as deep learning, reinforcement learning, computer vision, and motion planning
  4. Demonstrate the ability to work independently, thrive in a fast-paced, ever-changing environment, and communicate effectively with diverse stakeholders

Skills

Required

  • PhD student
  • Machine learning
  • Deep learning
  • Robotics
  • Python
  • PyTorch or JAX
  • Problem-solving skills
  • Attention to detail
  • Collaborative work

Nice to have

  • Reinforcement learning
  • Computer vision
  • Motion planning
  • Multimodal LLMs
  • World models
  • Image/video tokenization
  • Real2Sim/Sim2Real transfer
  • Bimanual manipulation
  • Open-vocabulary panoptic scene understanding
  • Scaling up multi-modal LLMs
  • End-to-end vision-language-action models
  • Java
  • C++
  • Experimental design
  • Statistical analysis

What the JD emphasized

  • Publication record at top-tier peer-reviewed conferences or journals, such as NeurIPS, CVPR, ICRA, RSS, CoRL, and ICLR

Other signals

  • Develop novel algorithms for LLMs and generative AI in robotics
  • Research problems in robotic perception, manipulation, and control
  • Multimodal LLMs, world models, vision-language-action models