ETRI-Knowledge Sharing Platform

Constrained Approximate Query Processing with Error and Response Time-Bound Guarantees for Efficient Big Data Analytics
Authors
Sungsoo Kim, Choon Seo Park, Taewhi Lee, Kihyuk Nam
Issue Date
2024-06
Citation
International Symposium on High-Performance Parallel and Distributed Computing (HPDC) 2024, pp.373-376
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1145/3625549.3658824
Abstract
Approximate query processing (AQP) is a technique for obtaining approximate answers to queries over large datasets. AQP techniques trade off accuracy for speed, making them ideal for scenarios where exact answers are not required or the cost of obtaining exact answers is prohibitive. This paper proposes a novel machine learning (ML)-based AQP framework that leverages both generative and inferential ML models to improve accuracy and efficiency. The framework first constructs a generative ML model that learns the underlying data distribution and then generates synthetic data that follows the same distribution. The proposed framework also includes a mechanism for constrained approximate query processing (CAQP) with bounded errors and bounded response times. This allows users to specify the desired error bound for the results of an aggregation query. The framework then selects a subset of the synthetic data that is guaranteed to satisfy the error bound. An evaluation of the proposed framework using the Instacart benchmark dataset and queries demonstrates significant efficiency improvements in AQP compared to existing techniques.
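The abstract does not say how the subset of synthetic data is chosen, so the following is only a minimal sketch of one common way to meet a user-specified error bound for an AVG aggregate. It assumes a central-limit-theorem confidence interval at 95% confidence, which gives a probabilistic bound rather than the guarantee the paper claims. The function name `sample_for_error_bound`, its parameters, and the Gaussian stand-in for generated rows are all hypothetical and are not taken from the paper.

```python
import math
import random
import statistics

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval (assumption)


def sample_for_error_bound(synthetic, rel_error, rng, start=100):
    """Grow a random sample until the CLT half-width of the mean estimate
    is within rel_error * |mean|, or until the whole pool is used.

    Hypothetical helper for illustration; not the paper's selection method.
    """
    pool = list(synthetic)
    rng.shuffle(pool)  # a random prefix of the shuffled pool is a random sample
    n = min(start, len(pool))
    while True:
        sample = pool[:n]
        mean = statistics.fmean(sample)
        half_width = Z_95 * statistics.stdev(sample) / math.sqrt(n)
        if half_width <= rel_error * abs(mean) or n == len(pool):
            return sample, mean, half_width
        n = min(2 * n, len(pool))  # double the sample size and retry


rng = random.Random(0)
# Stand-in for rows produced by a generative model (hypothetical data).
synthetic = [rng.gauss(50.0, 10.0) for _ in range(100_000)]

# Ask for an AVG estimate with at most 1% relative error.
sample, est_avg, half_width = sample_for_error_bound(synthetic, 0.01, rng)
print(len(sample), est_avg, half_width)
```

Doubling the sample size keeps the number of re-evaluations logarithmic in the pool size. A response-time bound could be approximated in the same loop by capping the number of rows scanned, but that cap is likewise an assumption and is not described in the abstract.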
KSP Keywords
Accuracy and efficiency, Aggregation query, Approximate query processing, Benchmark datasets, Data distribution, Large datasets, Machine learning (ML), Synthetic data, Trade-off, Big data analytics, Bounded errors
This work is distributed under the terms of a Creative Commons License (CC BY).