ETRI-Knowledge Sharing Platform

Real-Time Data Flow Language Processing System for Handling Streams of Data
Authors
Choon Seo Park, Jin-Hwan Jeong, Myungcheol Lee, Yong-Ju Lee, Miyoung Lee, Sung Jin Hur
Issue Date
2014-09
Citation
International Conference on Scalable Information Systems (INFOSCALE) 2014, pp.97-106
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1007/978-3-319-16868-5_10
Abstract
The Apache Pig system generates MapReduce jobs by compiling scripts written in Pig Latin, so that large data sets can be processed in parallel on distributed computing nodes. Pig inherits the limitations of MapReduce; in particular, MapReduce supports only batch processing. With the recent spread of smart devices, streams of data are being generated at an explosive rate, and they increasingly need to be processed in real time. In this paper, we propose a data flow language processing system, called LAMA-CEP, that generates DAG-based stream processing services to process unbounded streams of data continuously in real time. We present a stream processing language, called Pig Latin Stream, that extends Pig Latin. Programs written in Pig Latin Stream are translated into distributed stream processing jobs, which are then executed on a highly scalable distributed stream processing system to process large streams of data in real time.
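The abstract's central idea, turning a data flow description into a chain of operators that consume an unbounded stream, can be illustrated with a minimal single-process sketch. This is not LAMA-CEP or Pig Latin Stream: the operator names (`filter_op`, `tumbling_count`, `compile_pipeline`), the record layout, and the count-based window are all assumptions made for illustration, and the chain here is only the simplest kind of DAG (a linear path) rather than a distributed job.

```python
from collections import Counter
from itertools import cycle, islice


def filter_op(predicate):
    """Hypothetical operator: pass through only records matching the predicate."""
    def run(stream):
        return (rec for rec in stream if predicate(rec))
    return run


def tumbling_count(key, window_size):
    """Hypothetical operator: emit per-key counts once every `window_size` records."""
    def run(stream):
        counts = Counter()
        seen = 0
        for rec in stream:
            counts[rec[key]] += 1
            seen += 1
            if seen == window_size:
                yield dict(counts)   # copy the window result before resetting
                counts.clear()
                seen = 0
    return run


def compile_pipeline(operators):
    """Chain operators into a linear data flow (a path-shaped DAG)."""
    def run(source):
        stream = iter(source)
        for op in operators:
            stream = op(stream)      # each operator lazily wraps the previous one
        return stream
    return run


# Illustrative sensor readings (made-up data, not from the paper).
events = [
    {"device": "a", "temp": 31},
    {"device": "b", "temp": 18},
    {"device": "a", "temp": 35},
    {"device": "b", "temp": 40},
    {"device": "a", "temp": 29},
]

pipeline = compile_pipeline([
    filter_op(lambda rec: rec["temp"] > 30),
    tumbling_count("device", window_size=2),
])

# Bounded input: only complete windows are emitted.
results = list(pipeline(events))

# Unbounded input: cycle() never ends, so results are pulled incrementally.
first_two = list(islice(pipeline(cycle(events)), 2))
```

Because every operator is a generator, nothing is computed until a downstream consumer asks for a result. That is what lets the same pipeline run over the never-ending `cycle(events)` source, in contrast to a batch job that must see all of its input before producing output.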
KSP Keywords
Apache Pig, Batch processing, Language processing, Large datasets, Real-time data flow, Smart devices, Distributed computing, Distributed stream processing system