ETRI-Knowledge Sharing Platform


Details

Journal article: Moving Metadata from Ad Hoc Files to Database Tables for Robust, Highly Available, and Scalable HDFS
Cited 12 times in Scopus, downloaded 9 times
Authors
원희선, 차우, 길명선, 문양세, 황규영
Publication date
2017-06
Source
Journal of Supercomputing, v.73 no.6, pp.2657-2681
ISSN
0920-8542
Publisher
Springer
DOI
https://dx.doi.org/10.1007/s11227-016-1949-7
Funded project
16MH1700, (Integrated) Development of Core Technologies for Smart Networking, 양선희
Abstract
As a representative large-scale data management technology, Apache Hadoop is an open-source framework for processing a variety of data such as SNS, medical, weather, and IoT data. Hadoop largely consists of HDFS, MapReduce, and YARN. Among them, we focus on improving the HDFS metadata management scheme responsible for storing and managing big data. We note that the current HDFS incurs many problems in system utilization due to its file-based metadata management. To solve these problems, we propose a novel metadata management scheme based on RDBMS for improving the functional aspects of HDFS. Through analysis of the latest HDFS, we first present five problems caused by its metadata management and derive three requirements of robustness, availability, and scalability for resolving these problems. We then design an overall architecture of the advanced HDFS, A-HDFS, which satisfies these requirements. In particular, we define functional modules according to HDFS operations and also present the detailed design strategy for adding or modifying the individual components in the corresponding modules. Finally, through implementation of the proposed A-HDFS, we validate its correctness by experimental evaluation and also show that A-HDFS satisfies all the requirements. The proposed A-HDFS significantly enhances the HDFS metadata management scheme and, as a result, ensures that the entire system improves its stability, availability, and scalability. Thus, we can exploit the improved distributed file system based on A-HDFS for various fields and, in addition, we can expect more applications to be actively developed.
Keywords
Advanced HDFS, Distributed file systems, Hadoop, HDFS, Metadata management
KSP suggested keywords
Ad hoc, Apache Hadoop, Big Data, Distributed File system, Functional Modules, Functional aspects, Open source, System utilization, database tables, design strategy, detailed design