AI Research Scientist, SysML - FAIR

Meta · Big Tech · Menlo Park, CA +2

Research Scientist role focused on fundamental advances in systems and infrastructure for large-scale machine learning, with an emphasis on distributed training, resource efficiency, and hardware-software co-design. The role involves cutting-edge research, publishing results, and influencing Meta's product development.

What you'd actually do

  1. Carry out cutting-edge research to advance the science and technology of machine learning systems
  2. Perform research that enables learning the semantics of data (images, video, text, audio, and other modalities)
  3. Contribute research that leads to innovations in scalable machine learning systems; resource-efficient AI data and algorithm scaling; neural network architectures; memory- and energy-efficient AI systems; and environmentally sustainable AI system and hardware designs
  4. Devise better data-driven models of AI system design and optimization
  5. Collaborate with researchers and cross-functional partners, including communicating research plans, progress, and results

Skills

Required

  • Python
  • C++
  • C
  • Rust
  • PyTorch
  • systems
  • computer architectures
  • compiler and programming languages
  • machine learning
  • artificial intelligence
  • developing and optimizing systems for at-scale machine learning execution
  • devising data-driven models and real-system experiments and design implementation for AI system optimization
  • scalable machine learning systems
  • resource-efficient AI data and algorithm scaling
  • neural network architectures
  • solving complex problems
  • comparing alternative solutions, tradeoffs, and different perspectives to determine a path forward
  • working and communicating cross-functionally in a team environment

Nice to have

  • PhD degree
  • cuBLAS
  • cuDNN
  • FlashAttention

What the JD emphasized

  • unprecedented scale
  • human-level intelligence
  • open science innovations
  • usability, efficiency, and sustainability as design principles
  • enabling distributed training at an unprecedented scale
  • advancements and development in training library and authoring components
  • training performance acceleration through hardware-software co-design
  • learning the semantics of data (images, video, text, audio, and other modalities)
  • scalable machine learning systems
  • resource-efficient AI data and algorithm scaling
  • neural network architectures
  • memory- and energy-efficient AI systems
  • environmentally sustainable AI system and hardware designs
  • data-driven models of AI system design and optimization
  • Publish research results
  • impacts Meta product development
  • equivalent practical experience
  • systems, computer architectures, compiler and programming languages, machine learning, and artificial intelligence
  • developing and optimizing systems for at-scale machine learning execution
  • devising data-driven models and real-system experiments and design implementation for AI system optimization
  • solving complex problems
  • comparing alternative solutions, tradeoffs, and different perspectives to determine a path forward
  • working and communicating cross-functionally in a team environment
  • Proven track record of achieving significant results and publications
  • publications at leading workshops, journals or conferences such as MLSys, ISCA, ASPLOS, HPCA, PLDI, CGO, NeurIPS, ICML, ICLR, or similar
  • Demonstrated research and software engineering experience via work experience, coding competitions, or widely used contributions in open source repositories (e.g. GitHub)

Other signals

  • advancing the field of artificial intelligence
  • making fundamental advances in technologies to help interact with and understand our world
  • advancing the state of AI through open science innovations
  • explore, design, and build ML systems and infrastructures at scale