Software Engineer - AI Agent Memory Infrastructure

ByteDance · Big Tech · San Jose, CA · R&D

This role builds and scales the core memory infrastructure for AI agents. It involves designing, developing, and optimizing large-scale, low-latency systems for storing, retrieving, and updating memory, with an emphasis on multimodal data and integration with LLMs. The goal is a unified platform for the various memory types that enables more personalized and context-aware AI experiences.

What you'd actually do

Design, build, and evolve the next-generation memory infrastructure for AI agents, developing a unified platform that supports long-term memory, conversational memory, and task-oriented memory.
Architect and optimize memory system pipelines for large-scale, low-latency, and high-availability environments, including data ingestion, storage, indexing, retrieval, updating, compression, and forgetting mechanisms to support real-time inference and personalized interactions.
Explore key challenges at the intersection of large language models, context engineering, and data management, including memory representation, retrieval and ranking, conflict resolution, summarization and fusion, and memory lifecycle management.
Design unified memory models and processing workflows for multimodal data (text, image, audio, behavioral signals), enhancing agents’ long-term consistency, personalization, and task completion in complex scenarios.
Collaborate closely with model, application, and platform teams to productionize memory capabilities, and continuously optimize system performance across quality, latency, cost, reliability, and safety.

Skills

Required

distributed systems
databases
information retrieval systems
AI infrastructure
system design
production engineering
Go
Python
C++
embeddings
retrieval-augmented generation (RAG)
context engineering
retrieval systems
long-term state management
memory extraction and representation
vector/graph indexing
retrieval and ranking
memory updating
compression and forgetting
multimodal memory fusion
system performance optimization
latency optimization
cost optimization
scalability optimization

Nice to have

agent memory systems
user profiling
recommendation/search feature platforms
knowledge base systems
mem0
memOS
memU
publications
open-source work
technical achievements
multimodal data processing
online inference systems
personalized agents
long-term user state modeling

What the JD emphasized

large-scale, low-latency, and highly reliable memory infrastructure
large-scale, low-latency, and high-availability environments
memory representation, retrieval and ranking, conflict resolution, summarization and fusion, and memory lifecycle management
multimodal data (text, image, audio, behavioral signals)
productionize memory capabilities
experience with complex production systems is highly preferred

Other signals

building core memory systems for AI agents
unified platform for long-term, conversational, and task-oriented memory
large-scale, low-latency, and highly reliable memory infrastructure
intersection of LLMs, data systems, and context engineering
memory representation, retrieval, and multimodal fusion
productionize memory capabilities
support a wide range of AI-driven applications

● Active

Posted 2mo ago · last seen 1w ago · 57 days open

AI score: 8/10
Stage: Agent
Location: San Jose, CA
Role: Senior · Builder
Function: Engineering
Domain: enterprise_ai
Team: AI Agent Memory Infrastructure
Maturity: Scaling

Skills

Agents & Autonomy

Agent Memory · Context Engineering · Planning & Reasoning

Applied ML Domains

NLP Applications · Personalization

Computer Vision & Multimodal

Multimodal AI · Sensor Fusion

Data Engineering

Data Labeling · Data Warehousing

Frameworks & Tools

Unix/Linux

General Experience & Skills

Performance Optimization · System Design

Infrastructure & Systems

Model Serving · Real-Time Systems · Reliability Engineering · Storage Systems

LLM & Foundation Models

Large Language Models (LLMs) · Model Compression

ML Ops & Evaluation

Model Lifecycle Management · Production ML Systems · Responsible AI & Safety

ML Techniques

Continual Learning · Imitation Learning · Optimization Methods

NLP & Language

Conversational AI · Information Retrieval · Text Summarization

Retrieval & Search

Ranking & Relevance

Speech & Audio

Speech & Audio Processing

About the Team

Join ByteDance’s AI Agent Memory Infrastructure team, where we build the core memory systems that power next-generation intelligent agents. Our focus is on creating a unified platform for long-term, conversational, and task-oriented memory, enabling more personalized and context-aware AI experiences. We design and operate large-scale, low-latency, and highly reliable memory infrastructure, covering the full lifecycle from storage and retrieval to updating and optimization. Working at the intersection of LLMs, data systems, and context engineering, we tackle challenges in memory representation, retrieval, and multimodal fusion.

Partnering closely with model and product teams, we turn advanced research into scalable production systems that support a wide range of AI-driven applications.

Responsibilities

Design, build, and evolve the next-generation memory infrastructure for AI agents, developing a unified platform that supports long-term memory, conversational memory, and task-oriented memory.
Architect and optimize memory system pipelines for large-scale, low-latency, and high-availability environments, including data ingestion, storage, indexing, retrieval, updating, compression, and forgetting mechanisms to support real-time inference and personalized interactions.
Explore key challenges at the intersection of large language models, context engineering, and data management, including memory representation, retrieval and ranking, conflict resolution, summarization and fusion, and memory lifecycle management.
Design unified memory models and processing workflows for multimodal data (text, image, audio, behavioral signals), enhancing agents’ long-term consistency, personalization, and task completion in complex scenarios.
Collaborate closely with model, application, and platform teams to productionize memory capabilities, and continuously optimize system performance across quality, latency, cost, reliability, and safety.
Stay up to date with cutting-edge advancements and contribute to the long-term technical roadmap of AI agent memory systems, driving innovation and capability evolution.

Requirements

Minimum Qualifications

Bachelor’s degree or higher in Computer Science, Artificial Intelligence, Data Science, or related fields.
Strong experience in distributed systems, databases, information retrieval systems, or AI infrastructure, with proven system design and production engineering capabilities.
Proficient in at least one programming language such as Go, Python, or C++, with strong coding standards and engineering best practices.
Solid understanding of core technologies in LLM applications, including but not limited to embeddings, retrieval-augmented generation (RAG), context engineering, retrieval systems, and long-term state management.
Familiarity with one or more key areas in memory systems: memory extraction and representation, vector/graph indexing, retrieval and ranking, memory updating, compression and forgetting, multimodal memory fusion.
Ability to analyze and optimize trade-offs across system performance, latency, cost, and scalability from both system and algorithm perspectives.

Preferred Qualifications

Experience in agent memory systems, user profiling, recommendation/search feature platforms, or knowledge base systems.
Contributions to or deep understanding of open-source memory frameworks such as mem0, memOS, memU, or similar solutions.
Strong track record in databases, information retrieval, machine learning, or AI systems, including publications, impactful open-source work, or notable technical achievements.
Experience in multimodal data processing, online inference systems, personalized agents, or long-term user state modeling.
Experience with complex production systems is highly preferred.
