Research Engineer, SysML - FAIR

Meta · Big Tech · Menlo Park, CA +1 · Remote

Research Engineer in Meta's Fundamental AI Research (FAIR) team, focusing on advancing AI through open science innovations in ML systems and infrastructure. The role involves research into scalable, efficient, and sustainable AI systems, particularly enabling distributed training at scale and hardware-software co-design for performance acceleration. Collaboration with researchers and product teams is key, with an emphasis on publishing research results.

What you'd actually do

  1. Carry out cutting-edge research to advance the science and technology of machine learning systems
  2. Perform research that enables learning the semantics of data (images, video, text, audio, and other modalities)
  3. Devise better data-driven models of AI system design and optimization
  4. Contribute research that leads to innovations in scalable machine learning systems; resource-efficient AI data and algorithm scaling and neural network architectures; memory- and energy-efficient AI systems; and environmentally sustainable AI system and hardware designs
  5. Collaborate with researchers and cross-functional partners including communicating research plans, progress, and results

Skills

Required

  • Python
  • C++
  • C
  • Rust
  • PyTorch framework
  • developing and optimizing systems for at-scale machine learning execution
  • devising data-driven models, real-system experiments, and design implementations for AI system optimization
  • scalable machine learning systems
  • resource-efficient AI data and algorithm scaling
  • neural network architectures
  • solving complex problems and comparing alternative solutions, tradeoffs, and different perspectives to determine a path forward
  • working and communicating cross-functionally in a team environment
  • research and software engineering experience
  • integrating AI tools to optimize/redesign workflows and drive measurable impact

Nice to have

  • PhD in the field of Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • ongoing AI skill development (e.g., prompt/context engineering, agent orchestration)
  • staying current with emerging AI technologies
  • implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)

What the JD emphasized

  • advancing the science and technology of machine learning systems
  • enabling distributed training at an unprecedented scale
  • training performance acceleration through hardware-software co-design
  • scalable machine learning systems
  • resource-efficient AI data and algorithm scaling
  • neural network architectures
  • memory- and energy-efficient AI systems
  • environmentally sustainable AI system and hardware designs
  • Publish research results
  • Publications at leading workshops, journals or conferences such as MLSys, ISCA, ASPLOS, HPCA, PLDI, CGO, NeurIPS, ICML, ICLR, or similar

Other signals

  • advancing the field of artificial intelligence
  • fundamental advances in systems
  • ML systems and infrastructures at scale
  • enabling distributed training at an unprecedented scale
  • advancements and development in training library
  • training performance acceleration through hardware-software co-design