Machine Learning Engineer II, Computer Vision Applied Science

Pinterest · Consumer · San Francisco, CA · ATG

This is a Machine Learning Engineer II role in Computer Vision Applied Science at Pinterest, focused on advancing vision-centric LLMs (VLMs). The role involves prototyping new model architectures, developing evaluation benchmarks for vision-centric capabilities, contributing to generative strategy, and assisting with data collection for RLHF and fine-tuning. The goal is to enhance the core Pinterest product using large-scale generative models built on visual-text datasets.

What you'd actually do

  1. Prototype new model architectures for Pinterest VLMs. We’re looking for hands-on experience fine-tuning open-source LLMs and improving their visual perception and tool-using capabilities.
  2. Develop new evaluation benchmarks tailored to vision-centric capabilities, such as fashion style recommendations.
  3. Read research papers, participate in group discussions, and help brainstorm our overall visual generative strategy at the company.
  4. Help collect relevant visual training data for Pinterest Canvas, particularly for RLHF, targeted fine-tuning, etc.
  5. Publish and publicize your work via conferences, paper submissions, blog posts, etc.

Skills

Required

  • generative computer vision models
  • visual encoders
  • LLMs
  • 2+ years of industry computer vision experience
  • M.S. or Ph.D. in Machine Learning, Computer Science, or related areas

Nice to have

  • Publications at top ML conferences
  • Experience using Cursor, Copilot, Codex, or similar AI coding assistants
  • Familiarity with LLM-powered productivity tools

What the JD emphasized

  • vision-centric LLMs
  • visual perception
  • tool-using capabilities
  • vision-centric capabilities
  • visual training data
  • RLHF
  • targeted fine-tuning

Other signals

  • building VLMs
  • generative models for production
  • visual-text datasets
  • multimodal search
  • text-to-image models