ETRI Knowledge Sharing Platform : VEX: Scaling HNSW-Based Vector Search with DPU Memory and Parallelism


Conference Paper VEX: Scaling HNSW-Based Vector Search with DPU Memory and Parallelism


Authors: Kihwan Kim, Hyungsun Yoo, Woojung Kim, Donghyun Min, Myungcheol Lee, Jihoon Yang, Weikuan Yu, Youngjae Kim

Citation: International Symposium on Cluster, Cloud and Internet Computing (CCGrid) 2026, pp.243-253

Abstract: Vector similarity search is a core component of modern AI services, and HNSW is widely adopted due to its high recall and low latency. However, its memory-intensive design makes billion-scale deployment difficult, and performance collapses when relying on swapping or remote memory. This paper targets recent DPUs (SmartNICs) with substantially improved compute capability and onboard DRAM, and proposes VEX, a host–DPU integrated vector search system that uses the DPU as both an extended memory tier and a parallel search engine for HNSW. VEX (i) partitions and places independent HNSW indices on the host and DPU while preserving semantic structure, (ii) minimizes host–DPU overhead via a dual-path DMA-based communication design, and (iii) overlaps search, communication, and aggregation with heterogeneity-aware pipelining. Experiments show that under memory pressure requiring disk access, VEX delivers 5–10× higher throughput than DiskANN at stable Recall@100. Even in ideal settings where the index fully resides in memory, VEX outperforms in-memory HNSW by up to 1.9× in query throughput.
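The partition, search, and aggregate flow described in the abstract can be illustrated with a minimal sketch. This is not the VEX implementation: an exact brute-force scan stands in for each per-partition HNSW index, a thread pool stands in for host and DPU searching concurrently, and all names (`search_partition`, `merge_topk`, `partitioned_search`, `recall_at_k`, `host_part`, `dpu_part`) are hypothetical. It only shows how per-partition top-k candidates can be merged into a global top-k and how Recall@k is computed.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_partition(partition, query, k):
    """Exact top-k within one partition (assumed stand-in for an HNSW search).

    partition: list of (vector_id, vector) pairs.
    Returns up to k (distance, vector_id) pairs, nearest first.
    """
    scored = ((sq_dist(vec, query), vid) for vid, vec in partition)
    return heapq.nsmallest(k, scored)


def merge_topk(partial_results, k):
    """Aggregate per-partition candidate lists into one global top-k."""
    return heapq.nsmallest(k, (hit for part in partial_results for hit in part))


def partitioned_search(partitions, query, k):
    """Search every partition concurrently, then merge the candidates."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(lambda p: search_partition(p, query, k), partitions))
    return merge_topk(partials, k)


def recall_at_k(found_ids, true_ids):
    """Fraction of the true top-k ids that appear among the returned ids."""
    return len(set(found_ids) & set(true_ids)) / len(true_ids)


# Toy data: one partition imagined on the host, one on the DPU.
host_part = [(0, (0.0, 0.0)), (1, (1.0, 1.0)), (2, (5.0, 5.0))]
dpu_part = [(3, (0.5, 0.5)), (4, (9.0, 9.0))]

hits = partitioned_search([host_part, dpu_part], query=(0.0, 0.0), k=3)
ids = [vid for _, vid in hits]
```

Because each partition returns its own k nearest candidates, the merged list here equals the exact global top-k. With approximate per-partition indices such as HNSW, the merged result is itself approximate, which is why a metric like Recall@100 is reported.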

Keyword: Vector Search, ANN, HNSW, DPU, SmartNIC, DMA, Heterogeneous architecture, Memory disaggregation, Hardware-Software Co-Design

KSP Keywords: Communication Design, Extended memory, Hardware-software co-design, Heterogeneous architectures, High recall, Low Latency, Parallel search, Remote memory, Search Engine, Similarity search, in-memory
