ByteDance — AI hiring signals

Currently tracking 106 active AI roles, with 26 new openings in the last 4 weeks. Primary focus: Serve · Engineering.

Hiring

106 / 106

Momentum (4w)

↑+1 +4%

26 opens last 4w · 25 prior 4w

Salary range

—

Tracked since

Feb '21

last role today

Hiring velocityscroll left for older weeks

1 new role

Jul 16

1 new role

Dec 30

1 new role

Jan 6

1 new role

Mar 9

2 new roles

Feb 22

1 new role

May 17

1 new role

Jun 21

1 new role

Aug 2

1 new role

Nov 1

1 new role

Feb 21

1 new role

Apr 4

1 new role

May 16

2 new roles

Jun 27

1 new role

Jul 4

1 new role

Dec 5

1 new role

Jan 2

2 new roles

1 new role

Feb 6

1 new role

Mar 20

1 new role

Apr 24

1 new role

May 15

1 new role

2 new roles

Jul 17

1 new role

Aug 7

1 new role

2 new roles

Sep 4

1 new role

Oct 23

1 new role

Nov 13

2 new roles

Dec 11

1 new role

Apr 8

1 new role

2 new roles

May 13

1 new role

Jun 24

1 new role

Jul 8

2 new roles

Aug 26

1 new role

Sep 2

1 new role

3 new roles

Nov 4

1 new role

2 new roles

Dec 9

1 new role

3 new roles

1 new role

Jan 6

2 new roles

1 new role

Feb 3

1 new role

2 new roles

Mar 3

2 new roles

5 new roles

4 new roles

Apr 7

1 new role

9 new roles

1 new role

May 19

4 new roles

6 new roles

Jun 2

1 new role

3 new roles

1 new role

2 new roles

Jul 7

3 new roles

1 new role

Aug 4

6 new roles

3 new roles

4 new roles

3 new roles

Sep 1

2 new roles

3 new roles

15 new roles

1 new role

Oct 13

2 new roles

1 new role

2 new roles

Nov 3

1 new role

2 new roles

1 new role

Dec 8

3 new roles

1 new role

3 new roles

2 new roles

Jan 5

1 new role

5 new roles

3 new roles

4 new roles

Feb 2

17 new roles

2 new roles

13 new roles

Mar 2

4 new roles

2 new roles

3 new roles

10 new roles

9 new roles

Apr 6

3 new roles

14 new roles

1 new role

5 new roles

May 4

6 new roles

Jobs (40)

106 AI · 282 total active

Title	Stage	Function	Location	First seen	AI score
Research Engineer - LLM/VLM Inference Optimization (Seed Infra) Research Engineer focused on optimizing LLM/VLM inference systems, including inference engines, serving frameworks, and deployment pipelines. Requires expertise in performance optimization techniques, C/C++, Python, ML frameworks, and production-scale LLM inference deployment.	Serve	Engineering	Seattle, WA	5w ago	9
Research Engineer - LLM/VLM Inference Optimization (Seed Infra) Research Engineer focused on optimizing LLM/VLM inference systems, including engines, serving frameworks, and deployment pipelines, using advanced performance techniques and collaborating with research teams.	Serve	Engineering	San Jose, CA	5w ago	9
Senior Research Scientist/Engineer - AI Infrastructure Seeking an experienced Research Scientist/Engineer to design and build next-generation AI infrastructure at ByteDance, focusing on large-scale systems, AI, and emerging hardware to enable efficient and scalable AI workloads. The role involves architecting the end-to-end AI factory, exploring emerging trends, optimizing ML stack performance, and aligning cross-functional teams.	ServeData	Research	San Jose, CA	Jan 30	9
Senior Research Scientist - Machine Learning System Develop and optimize large-scale distributed ML training and inference systems, focusing on LLM inference frameworks and GPU/CUDA performance optimization for high-performance LLM inference engines.	Serve	Engineering	San Jose, CA	Aug '25	9
Tech Lead, Research Scientist/Engineer - AI Infrastructure Research Scientist/Engineer role focused on defining and building next-generation AI infrastructure for large-scale AI workloads, including training, RL, and inference, considering compute, storage, networking, chips, power, and data layers. The role involves tracking AI trends, optimizing system performance, and aligning cross-functional teams.	ServeData	Research	San Jose, CA	May '25	9
Research Engineer / Scientist - Storage for LLM Research Engineer/Scientist focused on designing and implementing a high-performance KV cache layer for LLM inference to improve latency, throughput, and cost-efficiency. This role involves optimizing intermediate state storage and retrieval for transformer-based LLMs, collaborating with inference and serving teams, and potentially extending open-source KV stores or building custom GPU-aware caching layers.	Serve	Engineering	Seattle, WA	May '25	9
AI Algorithm Expert - Hand Tracking, PICO - San Jose Develop and optimize high-precision, low-latency hand tracking algorithms for XR scenarios, including monocular/multiple vision and multi-sensor fusion. Build 3D gesture pose estimation models for challenging conditions, optimize real-time inference performance on mobile XR headsets, and lead the development of a multimodal ML interaction framework for natural XR interaction. Promote patent layout and publish papers in top conferences.	ServePost-train	Engineering	San Jose, CA	Sep '25	8
Senior Research Engineer / Scientist - Storage for LLM Senior Research Engineer/Scientist focused on designing and implementing a high-performance KV cache layer for LLM inference to improve latency, throughput, and cost-efficiency. This role involves optimizing caching for transformer-based models, collaborating with inference teams, and potentially extending open-source KV stores or building custom GPU-aware caching layers.	Serve	Engineering	Seattle, WA	May '25	8
Research Engineer / Scientist - Storage for LLM Research Engineer/Scientist focused on designing and implementing a high-performance KV cache layer for LLM inference to improve latency, throughput, and cost-efficiency in transformer-based model serving.	Serve	Research	San Jose, CA	May '25	8
Senior Research Engineer / Scientist -AI for Databases Research Engineer/Scientist focused on applying AI/ML to database management systems, including query optimization, indexing, workload forecasting, and developing self-managing databases. The role involves integrating AI models into production systems and publishing research findings.	ServeData	Research	Seattle, WA	May '25	8
Research Engineer / Scientist -AI for Databases Research Engineer/Scientist role focusing on applying AI/ML to database management systems, including query optimization, indexing, workload forecasting, and developing self-managing databases. The role involves research and development, integrating AI models into production systems, analyzing large datasets, and publishing findings. Requires a PhD and strong publication record in AI/databases/systems, with experience in database internals and ML frameworks.	ServeData	Research	Seattle, WA	May '25	8
Research Engineer / Scientist -AI for Databases Research Engineer/Scientist focused on applying AI/ML to database management systems, including query optimization, indexing, and workload forecasting, with a goal of building AI-native data infrastructure and intelligent optimization. The role involves research and development, integrating models into production, and publishing findings.	ServeData	Research	San Jose, CA	May '25	8
Machine Learning Engineer - Inference Machine Learning Engineer focused on designing, implementing, and optimizing distributed inference infrastructure for large-scale AI models in the consumer domain, specifically for ads, feeds, and search ranking.	Serve	Engineering	San Jose, CA	Mar '25	8
Tech Lead - Machine Learning Platform Engineer Machine Learning Platform Engineer to develop and maintain a platform supporting deep learning models for code development, testing, training, model deployment, and other core business functions. The platform is foundational for recommendation, advertising, and search systems, involving recommended systems and distributed training of large-scale deep learning models.	ServeData	Engineering	San Jose, CA	3w ago	7
Machine Learning Engineer - Orchestration Machine Learning Engineer focused on optimizing resource efficiency in distributed orchestration and scheduling for training and inference systems, particularly for large-scale recommendation models. The role involves building and optimizing training system architectures and online inference architectures, integrating with MLops processes, and working within Kubernetes/Godel ecosystems.	ServePost-train	Engineering	San Jose, CA	5w ago	7
Edge ML Software Engineer (Model Optimization-PICO) - San Jose Software Engineer focused on optimizing and deploying ML models for edge NPUs in VR/AR devices, involving quantization, performance profiling, and hardware-aware optimizations to meet latency, memory, and power constraints.	Serve	Engineering	San Jose, CA	6w ago	7
Edge ML Software Engineer (Compiler-PICO) - San Jose Software Engineer specializing in ML compilers for edge NPU architectures, focusing on optimizing latency, memory, power, and thermal constraints for ML inference on target hardware. Requires strong compiler and deep learning model understanding, with preferred experience in quantization and ML compiler stacks.	Serve	Engineering	San Jose, CA	6w ago	7
Edge ML Software Engineer (System Modeling-PICO) - San Jose Develop transaction-level models of edge NPU architectures for ML workloads (CNNs, Transformers) to simulate execution, analyze performance, and optimize for latency, memory, and power targets. Requires strong C/C++ and System C proficiency, computer architecture understanding, and experience with ML accelerator modeling.	Serve	Engineering	San Jose, CA	6w ago	7
Tech Lead Software Engineer - AI Compute Infrastructure The Tech Lead Software Engineer will design and build large-scale, container-based cluster management and orchestration systems with extreme performance, scalability, and resilience, focusing on GPU and AI accelerator infrastructure for LLM inference. This role involves architecting next-generation cloud-native systems, collaborating on inference solutions using various LLM engines, and contributing to open-source projects.	Serve	Engineering	San Jose, CA	Jan 9	7
Tech Lead Software Engineer - AI Compute Infrastructure Tech Lead Software Engineer focused on building and maintaining large-scale, Kubernetes-native LLM inference infrastructure (AIBrix). The role involves designing and architecting GPU-optimized orchestration systems for hyper-scale environments, collaborating on inference solutions using various LLM engines, and staying current with AI/ML infrastructure advancements.	Serve	Engineering	Seattle, WA	Jan 9	7
Research Scientist - DPU & AI Infra Research Scientist focused on DPU and AI infrastructure, aiming to accelerate distributed training and inference by co-designing software and hardware solutions. Explores AI/ML infrastructure acceleration leveraging DPUs, GPUs, and custom hardware.	ServeData	Research	San Jose, CA	Sep '25	7
Senior Research Scientist - DPU & AI Infra Research Scientist role focused on designing and developing DPU network software for AI/ML workloads, optimizing distributed training and inference, and exploring software-hardware co-design for cloud and AI computing infrastructure.	ServeData	Research	Seattle, WA	Sep '25	7
Research Scientist - DPU & AI Infra Research Scientist role focused on designing and developing DPU network software for AI/ML workloads, including distributed training and inference acceleration, and software-hardware co-design.	ServeData	Research	Seattle, WA	Sep '25	7
Tech Lead, Research Scientist - DPU & AI Infra Tech Lead, Research Scientist focused on DPU and AI infrastructure, optimizing distributed training and inference by leveraging DPUs, GPUs, and custom hardware. The role involves designing and developing high-performance network software, collaborating on software-hardware co-design, and driving end-to-end performance optimization.	ServeData	Engineering	Seattle, WA	Sep '25	7
Tech Lead, Research Scientist - DPU & AI Infra This role focuses on designing and developing DPU network software and exploring AI/ML infrastructure acceleration using DPUs, GPUs, and custom hardware to optimize distributed training and inference. It involves software-hardware co-design and end-to-end performance optimization for cloud-scale computing.	ServeData	Research	San Jose, CA	Sep '25	7
Senior Cloud Acceleration Engineer – DPU & AI Infra Senior Cloud Acceleration Engineer focused on DPU and AI infrastructure, involving software-hardware co-design to optimize distributed training and inference performance. Requires strong C/C++ and Linux systems development, with experience in networking, distributed systems, or AI/ML systems.	ServeAgent	Engineering	Seattle, WA	Sep '25	7
Senior Software Engineer - AI Compute Infrastructure Senior Software Engineer to design and build large-scale, container-based cluster management and orchestration systems for LLM inference, focusing on performance, scalability, and cost-efficiency. The role involves architecting GPU and AI accelerator infrastructure, collaborating on inference solutions using various LLM engines, and staying current with AI/ML infrastructure advancements.	Serve	Engineering	Seattle, WA	Sep '25	7
Software Engineer - AI Compute Infrastructure Software Engineer focused on building and maintaining large-scale, Kubernetes-native AI compute infrastructure for LLM inference, emphasizing performance, scalability, and cost-efficiency. The role involves architecting GPU-optimized systems and collaborating on inference solutions using various LLM engines.	Serve	Engineering	Seattle, WA	Sep '25	7
Software Engineer - AI Compute Infrastructure Software Engineer focused on building and maintaining large-scale, Kubernetes-native LLM inference infrastructure (AIBrix) with a focus on performance, scalability, and cost-efficiency. The role involves architecting GPU-optimized systems, collaborating on inference solutions using various LLM engines, and contributing to open-source projects.	Serve	Engineering	San Jose, CA	Sep '25	7
Cloud Acceleration Engineer – DPU & AI Infra This role focuses on designing and developing DPU network software and exploring AI/ML infrastructure acceleration, specifically for distributed training and inference. It involves software-hardware co-design and performance optimization of systems related to AI computing.	ServeData	Engineering	Seattle, WA	Sep '25	7
Cloud Acceleration Engineer – DPU & AI Infra ByteDance is seeking a Cloud Acceleration Engineer to focus on DPU and AI infrastructure. The role involves designing and developing high-performance DPU network software, collaborating on software-hardware co-design, and exploring AI/ML infrastructure acceleration for distributed training and inference. The position requires strong C/C++ and Linux systems development skills, with a background in areas like software-hardware co-design, distributed systems, networking, or AI/ML systems.	ServeData	Engineering	San Jose, CA	Sep '25	7
Tech Lead, AML Orchestration Tech Lead for an Applied Machine Learning (AML) team focused on building and advancing distributed orchestration platforms for recommendation systems, ads ranking, and search ranking. The role involves leading a team of ML Engineers, setting technical strategy for resource efficiency, distributed training, and online inference systems, and optimizing large-scale distributed orchestration and scheduling strategies.	ServeAgent	Engineering	San Jose, CA	Sep '25	7
Machine Learning Platform Engineer, Applied Machine Learning Team Machine Learning Platform Engineer to develop and maintain a platform supporting deep learning models for code development, testing, training, model deployment, and other core business functions. The role supports recommendation, advertising, and search systems, focusing on distributed training of large-scale deep learning models.	ServeData	Engineering	San Jose, CA	May '25	7
Software Engineer - Compute Infrastructure (Orchestration & Scheduling) Software Engineer role focused on building and optimizing large-scale compute infrastructure (Kubernetes, Serverless) to support AI and LLM workloads, including training and inference. The role involves enhancing cluster management, developing intelligent scheduling systems leveraging AI models for resource optimization, and leading infrastructure for next-gen ML workloads.	ServeAgent	Engineering	Seattle, WA	Apr '25	7
Senior Software Engineer - Compute Infrastructure (Orchestration & Scheduling) Senior Software Engineer focused on building and optimizing large-scale compute infrastructure (Kubernetes, Serverless) for AI and LLM workloads, including scheduling, resource management, and inference. The role involves developing intelligent scheduling systems using AI models and contributing to open-source projects.	ServeAgent	Engineering	Seattle, WA	Apr '25	7
Senior Software Engineer - Compute Infrastructure (Orchestration & Scheduling) Senior Software Engineer focused on building and optimizing large-scale compute infrastructure (Kubernetes, Serverless) for AI and LLM workloads, including scheduling, resource management, and inference. The role involves enhancing performance, scalability, and cost-efficiency for training and inference, with a focus on heterogeneous resources (CPU, GPU) and open-sourcing key technologies.	ServeAgent	Engineering	San Jose, CA	Apr '25	7
Software Engineer - Compute Infrastructure (Orchestration & Scheduling) Software Engineer role focused on building and optimizing large-scale compute infrastructure (Kubernetes, Serverless) for AI and LLM workloads, emphasizing resource efficiency, scheduling, and reliability. The role involves developing intelligent scheduling systems leveraging AI models and leading infrastructure for ML training/inference.	ServeAgent	Engineering	San Jose, CA	Apr '25	7
Machine Learning Engineer - PICO Perception - San Jose Machine Learning Engineer focused on optimizing and deploying AI algorithms on Qualcomm chips for XR devices, emphasizing low-power consumption and performance improvement. This role involves close collaboration with hardware vendors and contributing to the AI toolchain and technical ecosystem.	Serve	Engineering	San Jose, CA	Dec '24	7
Senior Site Reliability Engineer - Applied Machine Learning Site Reliability Engineer for an Applied Machine Learning team focused on next-generation recommendation algorithms and platforms. The role involves ensuring high availability and creating automated systems for large-scale AI/recommendation systems.	ServeShip	Engineering	San Jose, CA	Sep '21	7
AI/LLM Network Software Development Engineer Develops and optimizes high-speed network infrastructure and communication frameworks specifically for AI/LLM applications, focusing on performance, scalability, and reliability in large-scale data centers.	Serve	Engineering	Seattle, WA	Feb '21	7

Title

Stage

Function

Location

First seen

AI score

Research Engineer - LLM/VLM Inference Optimization (Seed Infra)

Research Engineer focused on optimizing LLM/VLM inference systems, including inference engines, serving frameworks, and deployment pipelines. Requires expertise in performance optimization techniques, C/C++, Python, ML frameworks, and production-scale LLM inference deployment.

Serve

Engineering

Seattle, WA

5w ago

Research Engineer - LLM/VLM Inference Optimization (Seed Infra)

Research Engineer focused on optimizing LLM/VLM inference systems, including engines, serving frameworks, and deployment pipelines, using advanced performance techniques and collaborating with research teams.

Serve

Engineering

San Jose, CA

5w ago

Senior Research Scientist/Engineer - AI Infrastructure

Seeking an experienced Research Scientist/Engineer to design and build next-generation AI infrastructure at ByteDance, focusing on large-scale systems, AI, and emerging hardware to enable efficient and scalable AI workloads. The role involves architecting the end-to-end AI factory, exploring emerging trends, optimizing ML stack performance, and aligning cross-functional teams.

ServeData

Research

San Jose, CA

Jan 30

Senior Research Scientist - Machine Learning System

Develop and optimize large-scale distributed ML training and inference systems, focusing on LLM inference frameworks and GPU/CUDA performance optimization for high-performance LLM inference engines.

Serve

Engineering

San Jose, CA

Aug '25

Tech Lead, Research Scientist/Engineer - AI Infrastructure

Research Scientist/Engineer role focused on defining and building next-generation AI infrastructure for large-scale AI workloads, including training, RL, and inference, considering compute, storage, networking, chips, power, and data layers. The role involves tracking AI trends, optimizing system performance, and aligning cross-functional teams.

ServeData

Research

San Jose, CA

May '25

Research Engineer / Scientist - Storage for LLM

Research Engineer/Scientist focused on designing and implementing a high-performance KV cache layer for LLM inference to improve latency, throughput, and cost-efficiency. This role involves optimizing intermediate state storage and retrieval for transformer-based LLMs, collaborating with inference and serving teams, and potentially extending open-source KV stores or building custom GPU-aware caching layers.

Serve

Engineering

Seattle, WA

May '25

AI Algorithm Expert - Hand Tracking, PICO - San Jose

Develop and optimize high-precision, low-latency hand tracking algorithms for XR scenarios, including monocular/multiple vision and multi-sensor fusion. Build 3D gesture pose estimation models for challenging conditions, optimize real-time inference performance on mobile XR headsets, and lead the development of a multimodal ML interaction framework for natural XR interaction. Promote patent layout and publish papers in top conferences.

ServePost-train

Engineering

San Jose, CA

Sep '25

Senior Research Engineer / Scientist - Storage for LLM

Senior Research Engineer/Scientist focused on designing and implementing a high-performance KV cache layer for LLM inference to improve latency, throughput, and cost-efficiency. This role involves optimizing caching for transformer-based models, collaborating with inference teams, and potentially extending open-source KV stores or building custom GPU-aware caching layers.

Serve

Engineering

Seattle, WA

May '25

Research Engineer / Scientist - Storage for LLM

Research Engineer/Scientist focused on designing and implementing a high-performance KV cache layer for LLM inference to improve latency, throughput, and cost-efficiency in transformer-based model serving.

Serve

Research

San Jose, CA

May '25

Senior Research Engineer / Scientist -AI for Databases

Research Engineer/Scientist focused on applying AI/ML to database management systems, including query optimization, indexing, workload forecasting, and developing self-managing databases. The role involves integrating AI models into production systems and publishing research findings.

ServeData

Research

Seattle, WA

May '25

Research Engineer / Scientist -AI for Databases

Research Engineer/Scientist role focusing on applying AI/ML to database management systems, including query optimization, indexing, workload forecasting, and developing self-managing databases. The role involves research and development, integrating AI models into production systems, analyzing large datasets, and publishing findings. Requires a PhD and strong publication record in AI/databases/systems, with experience in database internals and ML frameworks.

ServeData

Research

Seattle, WA

May '25

Research Engineer / Scientist -AI for Databases

Research Engineer/Scientist focused on applying AI/ML to database management systems, including query optimization, indexing, and workload forecasting, with a goal of building AI-native data infrastructure and intelligent optimization. The role involves research and development, integrating models into production, and publishing findings.

ServeData

Research

San Jose, CA

May '25

Machine Learning Engineer - Inference

Machine Learning Engineer focused on designing, implementing, and optimizing distributed inference infrastructure for large-scale AI models in the consumer domain, specifically for ads, feeds, and search ranking.

Serve

Engineering

San Jose, CA

Mar '25

Tech Lead - Machine Learning Platform Engineer

Machine Learning Platform Engineer to develop and maintain a platform supporting deep learning models for code development, testing, training, model deployment, and other core business functions. The platform is foundational for recommendation, advertising, and search systems, involving recommended systems and distributed training of large-scale deep learning models.

ServeData

Engineering

San Jose, CA

3w ago

Machine Learning Engineer - Orchestration

Machine Learning Engineer focused on optimizing resource efficiency in distributed orchestration and scheduling for training and inference systems, particularly for large-scale recommendation models. The role involves building and optimizing training system architectures and online inference architectures, integrating with MLops processes, and working within Kubernetes/Godel ecosystems.

ServePost-train

Engineering

San Jose, CA

5w ago

Edge ML Software Engineer (Model Optimization-PICO) - San Jose

Software Engineer focused on optimizing and deploying ML models for edge NPUs in VR/AR devices, involving quantization, performance profiling, and hardware-aware optimizations to meet latency, memory, and power constraints.

Serve

Engineering

San Jose, CA

6w ago

Edge ML Software Engineer (Compiler-PICO) - San Jose

Software Engineer specializing in ML compilers for edge NPU architectures, focusing on optimizing latency, memory, power, and thermal constraints for ML inference on target hardware. Requires strong compiler and deep learning model understanding, with preferred experience in quantization and ML compiler stacks.

Serve

Engineering

San Jose, CA

6w ago

Edge ML Software Engineer (System Modeling-PICO) - San Jose

Develop transaction-level models of edge NPU architectures for ML workloads (CNNs, Transformers) to simulate execution, analyze performance, and optimize for latency, memory, and power targets. Requires strong C/C++ and System C proficiency, computer architecture understanding, and experience with ML accelerator modeling.

Serve

Engineering

San Jose, CA

6w ago

Tech Lead Software Engineer - AI Compute Infrastructure

The Tech Lead Software Engineer will design and build large-scale, container-based cluster management and orchestration systems with extreme performance, scalability, and resilience, focusing on GPU and AI accelerator infrastructure for LLM inference. This role involves architecting next-generation cloud-native systems, collaborating on inference solutions using various LLM engines, and contributing to open-source projects.

Serve

Engineering

San Jose, CA

Jan 9

Tech Lead Software Engineer - AI Compute Infrastructure

Tech Lead Software Engineer focused on building and maintaining large-scale, Kubernetes-native LLM inference infrastructure (AIBrix). The role involves designing and architecting GPU-optimized orchestration systems for hyper-scale environments, collaborating on inference solutions using various LLM engines, and staying current with AI/ML infrastructure advancements.

Serve

Engineering

Seattle, WA

Jan 9

Research Scientist - DPU & AI Infra

Research Scientist focused on DPU and AI infrastructure, aiming to accelerate distributed training and inference by co-designing software and hardware solutions. Explores AI/ML infrastructure acceleration leveraging DPUs, GPUs, and custom hardware.

ServeData

Research

San Jose, CA

Sep '25

Senior Research Scientist - DPU & AI Infra

Research Scientist role focused on designing and developing DPU network software for AI/ML workloads, optimizing distributed training and inference, and exploring software-hardware co-design for cloud and AI computing infrastructure.

ServeData

Research

Seattle, WA

Sep '25

Research Scientist - DPU & AI Infra

Research Scientist role focused on designing and developing DPU network software for AI/ML workloads, including distributed training and inference acceleration, and software-hardware co-design.

ServeData

Research

Seattle, WA

Sep '25

Tech Lead, Research Scientist - DPU & AI Infra

Tech Lead, Research Scientist focused on DPU and AI infrastructure, optimizing distributed training and inference by leveraging DPUs, GPUs, and custom hardware. The role involves designing and developing high-performance network software, collaborating on software-hardware co-design, and driving end-to-end performance optimization.

ServeData

Engineering

Seattle, WA

Sep '25

Tech Lead, Research Scientist - DPU & AI Infra

This role focuses on designing and developing DPU network software and exploring AI/ML infrastructure acceleration using DPUs, GPUs, and custom hardware to optimize distributed training and inference. It involves software-hardware co-design and end-to-end performance optimization for cloud-scale computing.

ServeData

Research

San Jose, CA

Sep '25

Senior Cloud Acceleration Engineer – DPU & AI Infra

Senior Cloud Acceleration Engineer focused on DPU and AI infrastructure, involving software-hardware co-design to optimize distributed training and inference performance. Requires strong C/C++ and Linux systems development, with experience in networking, distributed systems, or AI/ML systems.

ServeAgent

Engineering

Seattle, WA

Sep '25

Senior Software Engineer - AI Compute Infrastructure

Senior Software Engineer to design and build large-scale, container-based cluster management and orchestration systems for LLM inference, focusing on performance, scalability, and cost-efficiency. The role involves architecting GPU and AI accelerator infrastructure, collaborating on inference solutions using various LLM engines, and staying current with AI/ML infrastructure advancements.

Serve

Engineering

Seattle, WA

Sep '25

Software Engineer - AI Compute Infrastructure

Software Engineer focused on building and maintaining large-scale, Kubernetes-native AI compute infrastructure for LLM inference, emphasizing performance, scalability, and cost-efficiency. The role involves architecting GPU-optimized systems and collaborating on inference solutions using various LLM engines.

Serve

Engineering

Seattle, WA

Sep '25

Software Engineer - AI Compute Infrastructure

Software Engineer focused on building and maintaining large-scale, Kubernetes-native LLM inference infrastructure (AIBrix) with a focus on performance, scalability, and cost-efficiency. The role involves architecting GPU-optimized systems, collaborating on inference solutions using various LLM engines, and contributing to open-source projects.

Serve

Engineering

San Jose, CA

Sep '25

Cloud Acceleration Engineer – DPU & AI Infra

This role focuses on designing and developing DPU network software and exploring AI/ML infrastructure acceleration, specifically for distributed training and inference. It involves software-hardware co-design and performance optimization of systems related to AI computing.

ServeData

Engineering

Seattle, WA

Sep '25

Cloud Acceleration Engineer – DPU & AI Infra

ByteDance is seeking a Cloud Acceleration Engineer to focus on DPU and AI infrastructure. The role involves designing and developing high-performance DPU network software, collaborating on software-hardware co-design, and exploring AI/ML infrastructure acceleration for distributed training and inference. The position requires strong C/C++ and Linux systems development skills, with a background in areas like software-hardware co-design, distributed systems, networking, or AI/ML systems.

ServeData

Engineering

San Jose, CA

Sep '25

Tech Lead, AML Orchestration

Tech Lead for an Applied Machine Learning (AML) team focused on building and advancing distributed orchestration platforms for recommendation systems, ads ranking, and search ranking. The role involves leading a team of ML Engineers, setting technical strategy for resource efficiency, distributed training, and online inference systems, and optimizing large-scale distributed orchestration and scheduling strategies.

ServeAgent

Engineering

San Jose, CA

Sep '25

Machine Learning Platform Engineer, Applied Machine Learning Team

Machine Learning Platform Engineer to develop and maintain a platform supporting deep learning models for code development, testing, training, model deployment, and other core business functions. The role supports recommendation, advertising, and search systems, focusing on distributed training of large-scale deep learning models.

ServeData

Engineering

San Jose, CA

May '25

Software Engineer - Compute Infrastructure (Orchestration & Scheduling)

Software Engineer role focused on building and optimizing large-scale compute infrastructure (Kubernetes, Serverless) to support AI and LLM workloads, including training and inference. The role involves enhancing cluster management, developing intelligent scheduling systems leveraging AI models for resource optimization, and leading infrastructure for next-gen ML workloads.

ServeAgent

Engineering

Seattle, WA

Apr '25

Senior Software Engineer - Compute Infrastructure (Orchestration & Scheduling)

Senior Software Engineer focused on building and optimizing large-scale compute infrastructure (Kubernetes, Serverless) for AI and LLM workloads, including scheduling, resource management, and inference. The role involves developing intelligent scheduling systems using AI models and contributing to open-source projects.

ServeAgent

Engineering

Seattle, WA

Apr '25

Senior Software Engineer - Compute Infrastructure (Orchestration & Scheduling)

Senior Software Engineer focused on building and optimizing large-scale compute infrastructure (Kubernetes, Serverless) for AI and LLM workloads, including scheduling, resource management, and inference. The role involves enhancing performance, scalability, and cost-efficiency for training and inference, with a focus on heterogeneous resources (CPU, GPU) and open-sourcing key technologies.

ServeAgent

Engineering

San Jose, CA

Apr '25

Software Engineer - Compute Infrastructure (Orchestration & Scheduling)

Software Engineer role focused on building and optimizing large-scale compute infrastructure (Kubernetes, Serverless) for AI and LLM workloads, emphasizing resource efficiency, scheduling, and reliability. The role involves developing intelligent scheduling systems leveraging AI models and leading infrastructure for ML training/inference.

ServeAgent

Engineering

San Jose, CA

Apr '25

Machine Learning Engineer - PICO Perception - San Jose

Machine Learning Engineer focused on optimizing and deploying AI algorithms on Qualcomm chips for XR devices, emphasizing low-power consumption and performance improvement. This role involves close collaboration with hardware vendors and contributing to the AI toolchain and technical ecosystem.

Serve

Engineering

San Jose, CA

Dec '24

Senior Site Reliability Engineer - Applied Machine Learning

Site Reliability Engineer for an Applied Machine Learning team focused on next-generation recommendation algorithms and platforms. The role involves ensuring high availability and creating automated systems for large-scale AI/recommendation systems.

ServeShip

Engineering

San Jose, CA

Sep '21

AI/LLM Network Software Development Engineer

Develops and optimizes high-speed network infrastructure and communication frameworks specifically for AI/LLM applications, focusing on performance, scalability, and reliability in large-scale data centers.

Serve

Engineering

Seattle, WA

Feb '21