Research Scientist, Operations Research (Infrastructure Lab)

ByteDance · Big Tech · San Jose, CA · Infrastructure

Research Scientist role focusing on operations research for AI-native data infrastructure. The role involves designing and optimizing vector indexing algorithms for vector databases, and exploring the integration of LLM, RL, and Agent technologies into operations research optimization pipelines. This includes developing AI for infrastructure optimization and LLM-based tooling like NL2SQL.

What you'd actually do

  1. For scenarios such as AI data centers and cloud resource scheduling, understand business requirements, formulate mathematical models, and design and develop efficient algorithms, including heuristics and meta-heuristics, for the resulting optimization problems.
  2. Explore AI for OR by integrating LLM, RL, and Agent technologies into the operations research optimization pipeline, including but not limited to natural-language interfaces to decision engines and improved interpretability of optimization results.

Skills

Required

  • operations research theory
  • linear programming
  • integer programming
  • combinatorial optimization
  • commercial or open-source solver (e.g., Gurobi, CPLEX, CP-SAT)
  • meta-heuristic algorithms (e.g., genetic algorithms, simulated annealing)
  • Python
  • C++
  • Java
  • data structures
  • algorithms

Nice to have

  • datacenter hardware supply chain operations
  • cloud computing product implementation
  • LLMs
  • reinforcement learning
  • Agent frameworks (e.g., LangGraph)
  • prompt engineering optimization
  • Agent development
  • combining traditional operations research optimization with generative AI

What the JD emphasized

  • Ph.D. degree with strong research achievements, such as multiple first-author papers at conferences (CCF-A) in the areas of Data, Systems, or AI.
  • Solid foundation in operations research theory, with expertise in areas such as linear programming, integer programming, and combinatorial optimization.
  • Proficient with at least one mainstream commercial or open-source solver (e.g., Gurobi, CPLEX, CP-SAT). Familiar with commonly used (meta-)heuristic algorithms (e.g., genetic algorithms, simulated annealing) and experienced in real-world deployment.
  • Strong engineering and coding skills, proficient in at least one programming language such as Python, C++, or Java, with solid knowledge of common data structures and algorithms.
  • Excellent logical thinking and business abstraction skills, capable of translating ambiguous business requirements into clear technical solutions. Strong communication, teamwork, and collaboration abilities.

Other signals

  • AI x systems
  • vector indexing algorithms
  • vector database infrastructure
  • integrating LLM, RL, and Agent technologies into the operations research optimization pipeline