Software Engineer - AI Agent Memory Infrastructure

ByteDance · Big Tech · San Jose, CA · R&D

This role focuses on building and scaling the core memory infrastructure for AI agents. It involves designing, developing, and optimizing large-scale, low-latency systems for storing, retrieving, and updating memory, with an emphasis on multimodal data and integration with LLMs. The goal is to enable more personalized and context-aware AI experiences through a unified platform that serves long-term, conversational, and task-oriented memory.

What you'd actually do

  1. Design, build, and evolve the next-generation memory infrastructure for AI agents, developing a unified platform that supports long-term memory, conversational memory, and task-oriented memory.
  2. Architect and optimize memory system pipelines for large-scale, low-latency, and high-availability environments, including data ingestion, storage, indexing, retrieval, updating, compression, and forgetting mechanisms to support real-time inference and personalized interactions.
  3. Explore key challenges at the intersection of large language models, context engineering, and data management, including memory representation, retrieval and ranking, conflict resolution, summarization and fusion, and memory lifecycle management.
  4. Design unified memory models and processing workflows for multimodal data (text, image, audio, behavioral signals), enhancing agents’ long-term consistency, personalization, and task completion in complex scenarios.
  5. Collaborate closely with model, application, and platform teams to productionize memory capabilities, and continuously optimize system performance across quality, latency, cost, reliability, and safety.

Skills

Required

  • distributed systems
  • databases
  • information retrieval systems
  • AI infrastructure
  • system design
  • production engineering
  • Go
  • Python
  • C++
  • embeddings
  • retrieval-augmented generation (RAG)
  • context engineering
  • retrieval systems
  • long-term state management
  • memory extraction and representation
  • vector/graph indexing
  • retrieval and ranking
  • memory updating
  • compression and forgetting
  • multimodal memory fusion
  • system performance optimization
  • latency optimization
  • cost optimization
  • scalability optimization

Nice to have

  • agent memory systems
  • user profiling
  • recommendation/search feature platforms
  • knowledge base systems
  • mem0
  • memOS
  • memU
  • publications
  • open-source work
  • technical achievements
  • multimodal data processing
  • online inference systems
  • personalized agents
  • long-term user state modeling

What the JD emphasized

  • large-scale, low-latency, and highly reliable memory infrastructure
  • large-scale, low-latency, and high-availability environments
  • memory representation, retrieval and ranking, conflict resolution, summarization and fusion, and memory lifecycle management
  • multimodal data (text, image, audio, behavioral signals)
  • productionize memory capabilities
  • experience with complex production systems is highly preferred

Other signals

  • building core memory systems for AI agents
  • unified platform for long-term, conversational, and task-oriented memory
  • intersection of LLMs, data systems, and context engineering
  • memory representation, retrieval, and multimodal fusion
  • support a wide range of AI-driven applications