ETRI-Knowledge Sharing Platform

Rethinking Data Bias: Dataset Copyright Protection via Embedding Class-wise Hidden Bias
Cited 1 time in Scopus
Authors
Jinhyeok Jang, ByungOk Han, Jaehong Kim, Chan-Hyun Youn
Issue Date
2024-10
Citation
European Conference on Computer Vision (ECCV) 2024, pp.1-17
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1007/978-3-031-72664-4_1
Abstract
Public datasets play a crucial role in advancing data-centric AI, yet they remain vulnerable to illicit use. This paper presents ‘under-cover bias,’ a novel dataset watermarking method that can reliably identify and verify unauthorized data usage. Our approach is inspired by the observation that trained models often inadvertently learn biased knowledge and can function on bias-only data, even without any information directly related to the target task. Leveraging this, we deliberately embed class-wise hidden bias via unnoticeable watermarks, which are unrelated to the target dataset but share the same labels. Consequently, a model trained on this watermarked data covertly learns to classify these watermarks. The model’s performance in classifying the watermarks serves as irrefutable evidence of unauthorized usage, because such performance cannot be achieved by chance. Our approach offers multiple benefits: 1) stealthy and model-agnostic watermarks; 2) minimal impact on the target task; 3) irrefutable evidence of misuse; and 4) improved applicability in practical scenarios. We validate these benefits through extensive experiments and extend our method to fine-grained classification and image segmentation tasks. Our implementation is available at https://github.com/jjh6297/UndercoverBias.
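The two ideas in the abstract, embedding a faint class-wise watermark and treating above-chance watermark classification as evidence, can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the repository linked above for that). The function names `blend_watermark` and `chance_p_value`, the alpha-blending rule, and the `alpha` value are all assumptions made for illustration; the abstract does not specify how the watermarks are embedded or how the evidence is quantified. The sketch assumes images are flat lists of pixel intensities and that a suspect model guessing at random would pick each of `num_classes` labels with equal probability.

```python
from math import comb


def blend_watermark(image, watermark, alpha=0.02):
    """Hypothetical embedding: faintly alpha-blend a watermark into an image.

    `image` and `watermark` are equal-length lists of pixel intensities.
    The watermark is assumed to carry the same class label as the image,
    so every image of a class receives a watermark of that class.
    """
    if len(image) != len(watermark):
        raise ValueError("image and watermark must have the same length")
    return [(1 - alpha) * p + alpha * w for p, w in zip(image, watermark)]


def chance_p_value(correct, total, num_classes):
    """Probability of at least `correct` hits out of `total` by random guessing.

    This is the upper tail of a Binomial(total, 1 / num_classes) distribution.
    A very small value suggests the suspect model learned the watermarks.
    """
    p = 1.0 / num_classes
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(correct, total + 1)
    )


if __name__ == "__main__":
    # Assumed scenario: a suspect 10-class model labels 90 of 100
    # watermark-only queries correctly.
    print(chance_p_value(correct=90, total=100, num_classes=10))
```

Under these assumptions, a model that classifies 90 of 100 bare watermarks correctly across 10 classes yields a tail probability far below any conventional significance threshold, which is one way to read the abstract's claim that such performance cannot be achieved by chance.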
KSP Keywords
Copyright protection, Data-centric, Fine grained (FG), Multiple benefits, Novel dataset, Public Datasets, data usage, fine-grained classification, image segmentation, watermarked data (Stego-image)