Software Engineer, Distributed Data Systems - Robotics

OpenAI · AI Frontier · San Francisco, CA · Research

Design and scale infrastructure for large-scale multimodal training and evaluation in robotics at OpenAI. The role focuses on distributed data pipelines and ML infrastructure, with an emphasis on scalability and reliability.

What you'd actually do

  1. Design, build, and maintain data infrastructure systems, such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure, while ensuring scalability, reliability, and security.
  2. Ensure the data platform can scale by orders of magnitude while remaining reliable and efficient.
  3. Partner with researchers to deeply understand requirements and translate them into production-ready systems.
  4. Harden, optimize, and maintain critical data infrastructure systems that power multimodal training and evaluation.

Skills

Required

  • distributed systems
  • large-scale infrastructure
  • data infrastructure
  • software engineering fundamentals
  • organizational skills

Nice to have

  • detail-oriented
  • rigor
  • comfort with ambiguity
  • adaptability to rapid change

What the JD emphasized

  • scale
  • reliable
  • multimodal training and evaluation

Other signals

  • large-scale multimodal training
  • distributed data systems
  • ML infrastructure