Engineering Internship, Enrichment and Curation

Wayve · Robotics · Sunnyvale, CA · AI Platform

Engineering intern role working on foundation models for embodied AI, including large-scale pretraining, post-training, leveraging language, or improving reasoning capabilities. The work involves training models on large-scale multimodal data and curating large multimodal datasets for training and evaluation.

What you'd actually do

  1. Work on foundation models for embodied AI, including large-scale pretraining, post-training, leveraging language, or improving reasoning capabilities.
  2. Train models on large-scale multimodal (vision, language, etc.) data efficiently in a multi-node distributed system, and evaluate their performance on open (and closed) datasets/benchmarks.
  3. Curate large multimodal datasets for training and evaluation.
  4. Lead a high-impact research project and publish at a top-tier conference (e.g., CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, ICRA, among others).

Skills

Required

  • previous experience with vision-language models, large language models, and natural language processing, especially around reasoning
  • prior experience in curating training data to steer the behavior of trained models
  • solid software engineering fundamentals, especially in Python
  • previous experience with PyTorch or a similar deep learning library (e.g., TensorFlow, JAX)
  • experience with multi-node distributed training of large models
  • interested in using large-scale multimodal (vision, language, etc.) datasets to improve embodied AI
  • currently pursuing a graduate degree in Computer Science, Machine Learning, Robotics, or a related technical field
  • proficient in at least one backend/systems programming language (e.g., Python, Ruby, Java)

Nice to have

  • previous publications at conferences such as CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, and ICRA, among others

What the JD emphasized

  • previous publications at conferences such as CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, and ICRA, among others

Other signals

  • foundation models for embodied AI
  • large-scale pretraining, post-training
  • leveraging language, or improving reasoning capabilities
  • large-scale multimodal (vision, language, etc.) data
  • curate large multimodal datasets