Machine Learning Engineer - Inference

ByteDance · Big Tech · San Jose, CA · R&D

Machine Learning Engineer focused on designing, implementing, and optimizing distributed inference infrastructure for large-scale AI models in the consumer domain, specifically for ads, feeds, and search ranking.

What you'd actually do

  1. Design and implement distributed inference infrastructure for feeds, ads, and search ranking models.
  2. Build monitoring and management tools to oversee the reliability and scalability of online inference servers.
  3. Triage system inefficiencies and bottlenecks, and improve system performance.
  4. Build tools to analyze bottlenecks and sources of instability, then design and implement solutions.
  5. Collaborate with product teams and provide general solutions that meet their requirements.
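To give a flavor of what duties 2 and 3 involve in practice, monitoring online inference servers usually means tracking tail latency against an SLO budget. The sketch below is purely illustrative and not from the JD; the function names (`latency_percentiles`, `breaches_slo`) and the nearest-rank percentile method are assumptions for the example.

```python
def latency_percentiles(samples_ms, percentiles=(50, 99)):
    """Return the requested latency percentiles (nearest-rank method) in ms."""
    ordered = sorted(samples_ms)
    n = len(ordered)
    result = {}
    for p in percentiles:
        # Nearest-rank percentile: ceil(p/100 * n), 1-indexed into the sorted samples.
        rank = max(1, -(-p * n // 100))
        result[f"p{p}"] = ordered[rank - 1]
    return result

def breaches_slo(samples_ms, p99_budget_ms):
    """Flag a server whose tail latency exceeds its p99 budget (hypothetical SLO check)."""
    return latency_percentiles(samples_ms)["p99"] > p99_budget_ms
```

In a real deployment this computation would run over streaming metrics (e.g., via a time-series store) rather than in-memory lists, but the p99-vs-budget comparison is the core of the reliability signal the role describes.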

Skills

Required

  • Design and implementation of distributed inference infrastructure
  • Building monitoring/management tools for online inference servers
  • Triaging system inefficiencies and bottlenecks
  • Improving system performance
  • Analyzing bottlenecks and sources of instability
  • Collaboration with product teams
  • At least 3 years of experience developing and deploying large-scale systems
  • Experience contributing to an open-source machine learning framework (TensorFlow / JAX / PyTorch / TorchScript / MXNet / TensorRT)
  • Strong background in one of the following fields: Hardware-Software Co-Design, High-Performance Computing, ML Hardware Acceleration (e.g., GPU/RDMA), or ML for Systems

What the JD emphasized

  • large-scale systems
  • distributed inference infrastructure
  • reliability and scalability
  • system inefficiency and bottlenecks
  • improving system performance

Other signals

  • ML infrastructure
  • inference
  • large-scale systems
  • high performance computing