Research Engineer / Scientist - AI for Databases

ByteDance · Big Tech · San Jose, CA · Infrastructure

A Research Engineer/Scientist role focused on applying AI/ML to database management systems, including query optimization, indexing, and workload forecasting, with the goal of building AI-native data infrastructure and intelligent optimization. The role spans research and development, integrating models into production systems, and publishing findings.

What you'd actually do

  1. Conduct research and development in applying AI/ML techniques to database management systems.
  2. Develop intelligent algorithms for tasks such as query planning, indexing, storage management, and workload prediction/scheduling.
  3. Collaborate with data infrastructure and engineering teams to integrate AI models into production systems.
  4. Analyze large-scale datasets from database workloads to uncover optimization opportunities.
  5. Publish findings in top-tier conferences and journals (VLDB, SIGMOD, ICDE, NeurIPS, etc.).
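To make item 2 concrete, here is a toy sketch of one well-known AI4DB idea, the learned index: a single linear model predicts where a key sits in a sorted array, and a bounded local search corrects the prediction. This is purely illustrative and not from the posting; the `LearnedIndex` class, its least-squares fit, and its error-bound logic are assumptions for the sketch (production designs, such as recursive model indexes, use hierarchies of models).

```python
import bisect

class LearnedIndex:
    """Toy learned index: a linear model maps key -> approximate position,
    then a search window bounded by the worst-case training error finds
    the exact slot. Illustrative only."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit position = a*key + b by least squares over (key, index) pairs.
        xs = self.keys
        mean_x = sum(xs) / n
        mean_y = (n - 1) / 2
        cov = sum((x - mean_x) * (y - mean_y) for y, x in enumerate(xs))
        var = sum((x - mean_x) ** 2 for x in xs)
        self.a = cov / var if var else 0.0
        self.b = mean_y - self.a * mean_x
        # Worst-case prediction error bounds the correction search window.
        self.err = max(abs(self._predict(x) - y) for y, x in enumerate(xs))

    def _predict(self, key):
        pos = int(round(self.a * key + self.b))
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the index of key in the sorted array, or -1 if absent."""
        guess = self._predict(key)
        lo = max(0, guess - self.err)
        hi = min(len(self.keys), guess + self.err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return -1
```

On near-linear key distributions the model lands close to the true slot and the correction window stays small, which is the intuition behind trading a B-tree traversal for a model prediction.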

Skills

Required

  • PhD in Computer Science, Data Science, or a related field with a focus on databases, systems, or machine learning
  • Strong publication record in top-tier venues (e.g., SIGMOD, VLDB, ICDE, NeurIPS) related to the AI4DB area
  • Strong background in database internals (e.g., PostgreSQL, MySQL, or a modern cloud-native database or big-data platform)
  • Hands-on experience with machine learning frameworks (e.g., XGBoost, LightGBM, TensorFlow, PyTorch, scikit-learn)

Nice to have

  • Proficiency in Python, C++, or Java
  • Experience with cloud database platforms (AWS, GCP, Azure)
  • Strong analytical, problem-solving, and communication skills
  • Familiarity with LLMs, reinforcement learning, neural architecture search, or automated database tuning

What the JD emphasized

  • Strong publication record in top-tier venues (e.g., SIGMOD, VLDB, ICDE, NeurIPS) related to the AI4DB area.

Other signals

  • AI-native data infrastructure
  • intelligent infrastructure optimization
  • LLM-based developer tools
  • high-performance cache systems for distributed storage and LLM inference
  • applying AI/ML techniques to database management systems
  • intelligent algorithms for query planning, indexing, storage management, and workload prediction/scheduling
  • integrate AI models into production systems