Machine Learning Engineer - Inference

ByteDance · Big Tech · San Jose, CA · R&D

Machine Learning Engineer focused on designing, implementing, and optimizing distributed inference infrastructure for large-scale AI models in the consumer domain, specifically for ads, feeds, and search ranking.

What you'd actually do

  1. Design and implement distributed inference infrastructure for feeds, ads, and search ranking models.
  2. Build monitoring and management tools to oversee the reliability and scalability of online inference servers.
  3. Triage system inefficiencies and bottlenecks, and improve system performance.
  4. Build tools to analyze bottlenecks and sources of instability, then design and implement solutions.
  5. Collaborate with product teams and provide general solutions that meet their requirements.
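To give a flavor of what duties 2 and 3 involve in practice, monitoring online inference servers usually means tracking tail latency against an SLO budget. The sketch below is purely illustrative and not from the JD; the function names (`latency_percentiles`, `breaches_slo`) and the nearest-rank percentile method are assumptions for the example.

```python
def latency_percentiles(samples_ms, percentiles=(50, 99)):
    """Return the requested latency percentiles (nearest-rank method) in ms."""
    ordered = sorted(samples_ms)
    n = len(ordered)
    result = {}
    for p in percentiles:
        # Nearest-rank percentile: ceil(p/100 * n), 1-indexed into the sorted samples.
        rank = max(1, -(-p * n // 100))
        result[f"p{p}"] = ordered[rank - 1]
    return result

def breaches_slo(samples_ms, p99_budget_ms):
    """Flag a server whose tail latency exceeds its p99 budget (hypothetical SLO check)."""
    return latency_percentiles(samples_ms)["p99"] > p99_budget_ms
```

In a real deployment this computation would run over streaming metrics (e.g., via a time-series store) rather than in-memory lists, but the p99-vs-budget comparison is the core of the reliability signal the role describes.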

Skills

Required

  • Design and implementation of distributed inference infrastructure
  • Building monitoring/management tools for online inference servers
  • Triaging system inefficiencies and bottlenecks
  • Improving system performance
  • Analyzing bottlenecks and sources of instability
  • Collaboration with product teams
  • At least 3 years of experience developing and deploying large-scale systems
  • Experience contributing to an open-source machine learning framework (TensorFlow / JAX / PyTorch / TorchScript / MXNet / TensorRT)
  • Strong background in one of the following fields: Hardware-Software Co-Design, High-Performance Computing, ML Hardware Acceleration (e.g., GPU/RDMA), or ML for Systems

What the JD emphasized

  • large-scale systems
  • distributed inference infrastructure
  • reliability and scalability
  • system inefficiency and bottlenecks
  • improving system performance

Other signals

  • ML infrastructure
  • inference
  • large-scale systems
  • high performance computing