Background and Motivation
Universal vector representation and the essentiality of vector databases.
Driven by rapid advances in AI, particularly in deep learning and large language models, vectors are believed to be the universal data representation of the future, connecting various modalities and domains. This makes vector databases necessary, both to manage these vectors effectively and to support different types of vector queries efficiently. This tutorial covers recent advances and challenges in query processing in vector databases. We begin with the background and motivation behind the emergence of vector databases. Next, we review techniques for four query types: similarity search, multi-similarity search, filtered similarity search, and similarity join. Lastly, we discuss open challenges and future research directions, with the aim of fostering innovation in the field.
Proximity graphs, quantization, maintenance, distance computation, hardware acceleration, OOD queries, and secure search.
Multi-dense vector search and dense-sparse combined similarity search.
Universal and dedicated indexes for categorical, numerical, range, interval, and timestamp filters.
Exact and approximate similarity join techniques for high-dimensional vectors.
Bridging guarantees and practicality; toward unified query processing in vector databases.
Ph.D. Candidate, The Chinese University of Hong Kong
Jiadong Xie is a Ph.D. candidate in the Department of Systems Engineering and Engineering Management at The Chinese University of Hong Kong, supervised by Prof. Jeffrey Xu Yu and Prof. Hong Cheng. He received his B.Eng. degree in Software Engineering from East China Normal University. His research focuses on vector data management and graph algorithms. He has published papers in conferences and journals such as SIGMOD, VLDB, TODS, VLDBJ, KDD, TheWebConf, ICDE, and OSDI. He has served as a conference program committee member or reviewer for TheWebConf, NeurIPS, KDD, ICDM, CIKM, and IEEE Big Data, and as a journal reviewer for TODS, TKDE, and TKDD.
Associate Prof., Xidian University
Yingfan Liu is a tenured Associate Professor in the School of Computer Science and Technology at Xidian University. He received his Ph.D. degree in Systems Engineering and Engineering Management from The Chinese University of Hong Kong in 2019, and his bachelor's and master's degrees from Xidian University in 2011 and 2014, respectively. His research focuses on vector databases, high-performance computing, and LLM inference. His recent publications have appeared in venues such as KDD, SIGMOD, WWW, VLDB, ICDE, and TODS.
Professor, The Hong Kong University of Science and Technology (Guangzhou)
Jeffrey Xu Yu is a Professor and Acting Head of the Data Science and Analytics Thrust at The Hong Kong University of Science and Technology (Guangzhou). His current research interests include vector databases, graph algorithms and systems, graph neural networks, query processing, and optimization. He has served on over 300 organizing committees and program committees of international conferences and workshops, including as PC co-chair for APWeb, WAIM, APWeb/WAIM, WISE, PAKDD, DASFAA, ICDM, NDBC, ADMA, CIKM, BigComp, and DSAA, and as general co-chair for APWeb and ICDM. He has also served as Information Director and member of the ACM SIGMOD Executive Committee and as Associate Editor of IEEE TKDE and the VLDB Journal, and he currently serves as Associate Editor for ACM TODS and WWW Journal.
@article{vector2026tutorial,
title = {Advances of Query Processing in Vector Databases},
author = {Xie, Jiadong and Liu, Yingfan and Yu, Jeffrey Xu},
journal = {Proceedings of the VLDB Endowment},
volume = {19},
number = {12},
pages = {XXX--XXX},
year = {2026},
doi = {XX.XX/XXX.XX}
}
@article{vector2026survey,
title = {A Survey on Query Processing in Vector Databases},
author = {Xie, Jiadong and Liu, Yingfan and Yu, Jeffrey Xu},
journal = {engrXiv},
year = {2026},
doi = {10.31224/7009},
url = {https://doi.org/10.31224/7009}
}