ETRI Knowledge Sharing Platform

Towards Accelerating k-NN with MPI and Near-Memory Processing
Authors: Hoo-Young Ahn, Seon Young Kim, Yoomi Park, Woojong Han, Nick Contini, Bharath Ramesh, Mustafa Abduljabbar, Dhabaleswar K. Panda
Issue Date: 2024-05
Citation: International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2024, pp. 608-615
Language: English
Type: Conference Paper
DOI: https://dx.doi.org/10.1109/IPDPSW63119.2024.00119
Abstract
Message Passing Interface (MPI) is a common parallel programming model in the High-Performance Computing (HPC) field, and it has recently been widely adopted in Artificial Intelligence (AI) applications. However, the performance of these applications is limited by the memory wall problem, that is, the performance gap between processor and memory. To address this problem, we propose a novel computing architecture for accelerating k-NN queries, a key operation in AI applications. In the proposed architecture, two or more computing nodes within a rack use the memory of a near-memory processing (NMP) device as a communication buffer and the NMP device's accelerator for MPI collective communication. The advantage of the proposed method is that it reduces data copying during local computation and eliminates the network cost of gathering local computation results for global computation. Furthermore, it can improve performance by processing data directly in the device's memory through near-memory processing. However, since such NMP devices are still under development, with no commercially available products at present, we have undertaken the task of developing them ourselves. Despite these circumstances, to demonstrate the feasibility of the proposed approach, we have implemented it in a single-node environment without making any unreasonable assumptions and compared it with conventional approaches. Although the evaluation uses a single-node environment, we have designed a more sophisticated architecture, analyzed its time complexity, and compared the estimated performance of the conventional and proposed approaches. We conclude that the proposed computing architecture is feasible, and we anticipate an even greater performance improvement in a multi-node environment.
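For context, the conventional approach that the abstract contrasts against can be sketched as a standard MPI program: every rank computes a local top-k over its shard of the data, then the local candidates are gathered over the network and merged into the global result. The sketch below is only illustrative and is not the paper's implementation; the dimensionality, shard size, value of k, and helper names (sq_dist, push_candidate) are assumptions, and the MPI_Gather step is exactly the kind of network cost that the proposed NMP-based architecture aims to eliminate.

```c
/*
 * Illustrative sketch of a *conventional* MPI-based distributed k-NN query:
 * local top-k per rank, then gather and global merge on rank 0.
 * All sizes and helper names here are assumptions for demonstration only.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <float.h>

#define DIM 8        /* dimensionality of each point (assumed)  */
#define LOCAL_N 1024 /* points stored per rank (assumed)        */
#define K 4          /* number of neighbors requested (assumed) */

/* Squared Euclidean distance between two DIM-dimensional vectors. */
static double sq_dist(const double *a, const double *b) {
    double s = 0.0;
    for (int d = 0; d < DIM; d++) {
        double diff = a[d] - b[d];
        s += diff * diff;
    }
    return s;
}

/* Keep the K smallest distances seen so far (simple insertion update). */
static void push_candidate(double *best, double dist) {
    if (dist >= best[K - 1]) return;
    int i = K - 1;
    while (i > 0 && best[i - 1] > dist) {
        best[i] = best[i - 1];
        i--;
    }
    best[i] = dist;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank owns a random shard of the dataset (local computation). */
    srand(42 + rank);
    double *shard = malloc(sizeof(double) * LOCAL_N * DIM);
    for (int i = 0; i < LOCAL_N * DIM; i++)
        shard[i] = (double)rand() / RAND_MAX;

    /* Rank 0 creates a query vector and broadcasts it to all ranks. */
    double query[DIM];
    if (rank == 0)
        for (int d = 0; d < DIM; d++) query[d] = (double)rand() / RAND_MAX;
    MPI_Bcast(query, DIM, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Local top-K distances over this rank's shard. */
    double local_best[K];
    for (int j = 0; j < K; j++) local_best[j] = DBL_MAX;
    for (int i = 0; i < LOCAL_N; i++)
        push_candidate(local_best, sq_dist(&shard[i * DIM], query));

    /* Gather every rank's local top-K to rank 0: the network step whose
     * cost the proposed NMP-based architecture is designed to remove. */
    double *all_best = NULL;
    if (rank == 0) all_best = malloc(sizeof(double) * K * size);
    MPI_Gather(local_best, K, MPI_DOUBLE,
               all_best, K, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Global merge: reduce size*K candidates to the final K. */
    if (rank == 0) {
        double global_best[K];
        for (int j = 0; j < K; j++) global_best[j] = DBL_MAX;
        for (int i = 0; i < K * size; i++)
            push_candidate(global_best, all_best[i]);
        printf("global k-NN squared distances:");
        for (int j = 0; j < K; j++) printf(" %.4f", global_best[j]);
        printf("\n");
        free(all_best);
    }

    free(shard);
    MPI_Finalize();
    return 0;
}
```

Under these assumptions, each rank still copies its candidates into host buffers and ships them over the network before the final merge on the host CPU; per the abstract, the proposed architecture instead keeps the communication buffer in the NMP device's memory and offloads the collective to the device's accelerator.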
KSP Keywords: AI Applications, Enhance performance, Estimated performance, High-performance computing (HPC), K-Nearest Neighbor (KNN), Local computation, MPI collective communication, Memory wall, Message Passing Interface, Multi-node, NN query