ETRI Knowledge Sharing Platform

Details

[Conference Paper] BWA-MEM-SCALE: Accelerating Genome Sequence Mapping on Commodity Servers
Cited 1 time in Scopus. Downloaded 29 times.
Authors
김창대, 고광원, 김태훈, 한대규, 서지원
Publication Date
2022-08
Source
International Conference on Parallel Processing (ICPP) 2022, pp.1-12
DOI
https://dx.doi.org/10.1145/3545008.3545033
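Funding Project
22ZS1300, Research on High-Performance Computing Technology to Overcome the Performance Limits of Artificial Intelligence Processing, 김강호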
Abstract
As advances in Next-Generation Sequencing have made genome sequence data generation faster and cheaper, accelerating the mapping of genome sequences to the reference genome has become an increasingly important problem. Much effort has been made to improve the performance of the sequence mapping process. In this paper, we propose BWA-MEM-SCALE, which offers software-based acceleration techniques that fully utilize system resources to speed up genome sequence mapping. BWA-MEM-SCALE has two optimization mechanisms that exploit system memory. The Exact Match Filter (EMF) finds the input reads that match the reference genome over their full length, so that the expensive mapping process is bypassed for those reads. The FM-index Accelerator (FMA) skips the prefix of sequences in seed matching by using pre-assembled data. Moreover, we fully utilize the CPU cores in the system by carefully pipelining the mapping process and using an in-memory index store. We implement the proposed mechanisms on BWA-MEM2, the state-of-the-art sequence mapping software. The evaluation shows that BWA-MEM-SCALE achieves substantial speedup over BWA-MEM2 when the system has sufficient resources. For example, with an additional 104 GB of memory, BWA-MEM-SCALE gives up to a 3.32x speedup over BWA-MEM2. Because the acceleration techniques can be deployed partially, BWA-MEM-SCALE improves mapping performance in proportion to the available system resources. Source code: https://github.com/etri/bwa-mem-scale
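The idea behind the Exact Match Filter in the abstract can be illustrated with a toy sketch. This is not the BWA-MEM-SCALE implementation: the function names (build_exact_match_table, slow_map, map_reads), the dictionary-based table, and the Hamming-distance fallback are all assumptions made for illustration. The sketch only shows the control flow: a memory-resident lookup answers full-length exact matches directly, and only the remaining reads go to the expensive mapper.

```python
from typing import Dict, List, Optional, Tuple


def build_exact_match_table(reference: str, read_len: int) -> Dict[str, int]:
    """Map every read_len-long substring of the reference to its first position.

    Illustrative only: a real filter would need a far more compact structure,
    since this table grows with the size of the reference.
    """
    table: Dict[str, int] = {}
    for pos in range(len(reference) - read_len + 1):
        table.setdefault(reference[pos:pos + read_len], pos)
    return table


def slow_map(reference: str, read: str) -> Optional[int]:
    """Hypothetical stand-in for the expensive mapper.

    Returns the first position with the fewest mismatches (Hamming distance),
    or None if the read is longer than the reference.
    """
    n = len(read)
    if n == 0 or n > len(reference):
        return None

    def mismatches(pos: int) -> int:
        return sum(a != b for a, b in zip(reference[pos:pos + n], read))

    return min(range(len(reference) - n + 1), key=mismatches)


def map_reads(
    reference: str, reads: List[str], read_len: int
) -> List[Tuple[Optional[int], str]]:
    """Map each read, bypassing the slow mapper for full-length exact matches."""
    table = build_exact_match_table(reference, read_len)
    results: List[Tuple[Optional[int], str]] = []
    for read in reads:
        pos = table.get(read)  # one lookup replaces the whole mapping pipeline
        if pos is not None:
            results.append((pos, "filter"))
        else:
            results.append((slow_map(reference, read), "mapper"))
    return results


if __name__ == "__main__":
    ref = "ACGTACGTTAGC"
    for read, result in zip(["GTAC", "TTAG", "ACGA"],
                            map_reads(ref, ["GTAC", "TTAG", "ACGA"], 4)):
        print(read, result)
```

In this sketch the table trades memory for time, which mirrors the abstract's point that additional memory can be converted into mapping speedup. The benefit depends on how many reads match the reference exactly.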
KSP Suggested Keywords
BWA-MEM, Data generation, Exact match, FM-index, Genome sequence, Next-Generation Sequencing(NGS), Reference genome, Sequence data, Sequence mapping, Source Code, Speed-up