Member of Technical Staff (AI Software Engineer, Multimodal)

Perplexity · AI Frontier · San Francisco, CA · AI

This role is for an AI Software Engineer focused on multimodal AI, building the product experiences and platform systems that enable human-AI interaction through modalities such as voice, images, and video. The work spans the stack, from UI to backend infrastructure, and includes designing, building, and owning systems, leading features end-to-end, and iterating on hard problems. Experience with production systems, strong product judgment, and an interest in multimodal AI are required.

What you'd actually do

  1. Design, build, and own product and multimodal platform systems for Perplexity.
  2. Lead features, projects, and products end-to-end, from problem definition through technical design, implementation, and launch.
  3. Hill-climb on hard problems, iterating continuously to improve the product for ourselves and our customers.
  4. Partner closely with engineers, product managers, designers, data scientists, and go-to-market teams.
  5. Build systems that take into account the nuances of multimodal AI.

Skills

Required

  • Experience building and operating production systems at a meaningful scale.
  • Ability to work up and down the stack, from deep systems primitives to getting the pixels and prompts just right.
  • Strong product judgment and the ability to translate user problems into simple, effective technical solutions.
  • Genuine interest in, and regular use of, multimodal AI products, plus a willingness to learn quickly.
  • Ability to think through novel problems and implement long-term solutions that scale.

Nice to have

  • Background in realtime audio or video processing.
  • Experience with audio stack technologies, including audio processing modules (APMs), echo cancellation, noise reduction/cancellation, and automatic gain control (AGC).
  • Experience with immersive UIs that integrate realtime data.
  • Some experience or familiarity with Rust or C++.

What the JD emphasized

  • build the product experiences and platform systems
  • work across the stack
  • realtime audio processing
  • evaluation systems
  • backend infrastructure
  • multimodal AI

Other signals

  • human-AI interaction
  • voice, images, video
  • product experiences
  • platform systems