Research Scientist

Snorkel AI · Data AI · Redwood City, CA +1 · Remote · 316 - Research

A Research Scientist role focused on designing, implementing, and validating novel AI techniques for data development, such as synthetic data generation with LLM as a Judge. The role involves prototyping and building end-to-end workflows, integrating research ideas into scalable systems, and collaborating with partners to productionize prototypes. The emphasis is on applied research and system-building in an enterprise AI context.

What you'd actually do

  1. Design, implement, and validate novel AI techniques for data development, such as synthetic data generation, using methods like LLM as a Judge.
  2. Prototype and build end-to-end workflows, integrating research ideas into scalable systems.
  3. Write high-quality, maintainable code, ensuring robust implementation of research-driven innovations.
  4. Move fast and adapt—iterating on solutions in response to new challenges, customer needs, and emerging research.
  5. Work closely with real-world design partners, testing solutions in applied settings with measurable impact.

Skills

Required

  • Python
  • machine learning frameworks (NumPy, Scikit-learn, Pandas, PyTorch, TensorFlow, etc.)
  • software engineering best practices (e.g., clean coding, modular design, version control)
  • ML infrastructure
  • cloud platforms (AWS, Google Cloud)
  • accelerators (GPUs, TPUs)

Nice to have

  • Ph.D. in machine learning or a related field with a strong publication record
  • AI, NLP, multi-modal models, LLMs, and generative AI
  • developing, experimenting with, and deploying AI models at scale

What the JD emphasized

  • novel AI techniques for data development
  • synthetic data generation
  • LLM as a Judge
  • prototype and build end-to-end workflows
  • integrating research ideas into scalable systems
  • applied research
  • system-building
  • productionize prototypes

Other signals

  • transform expert knowledge into specialized AI at scale
  • build custom AI with their data faster than ever before
  • prototype, build, and deploy innovative AI solutions
  • translating research breakthroughs into scalable, practical applications