ETRI Knowledge Sharing Platform


SMIT: Style-based Multi-level Feature Fusion for Image-to-Image Translation
Cited 0 times in Scopus
Authors
Narae Yi, Minho Park, Dong-oh Kang
Issue Date
2024-10
Citation
International Conference on Information and Communication Technology Convergence (ICTC) 2024, pp.1987-1992
Publisher
IEEE
Language
English
Type
Conference Paper
DOI
https://dx.doi.org/10.1109/ICTC62082.2024.10826878
Abstract
Image-to-Image translation (I2I) involves converting images from one domain to another by learning a mapping that preserves the essential content while altering the image's appearance. Despite advancements in deep learning, particularly with convolutional neural networks, I2I tasks remain challenging due to issues like artifacts and difficulty in maintaining content integrity during style transformations. Recent models like CycleGAN, MUNIT, and CUT have made strides in addressing these challenges, but limitations persist, particularly regarding diversity, realism, and precise style control. To overcome these limitations, we propose SMIT, a novel I2I model that leverages the strengths of the StyleGAN architecture combined with the Swin Transformer. Our model uses Swin-T as the encoder to extract multi-resolution features, which are then integrated into the StyleSwin model through a multi-level fuser module. This approach allows for effective embedding of contextual information, ensuring high-quality image generation with accurate domain transformations. Our model demonstrates superior performance over existing I2I methods on public datasets, validating its effectiveness through both qualitative and quantitative assessments.
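To make the data flow the abstract describes more concrete, below is a minimal PyTorch sketch of the general pattern: a multi-stage encoder yields multi-resolution features, a fuser maps each level to a style vector, and those vectors modulate a StyleGAN-style decoder. This is an illustrative assumption, not the paper's implementation: the plain conv stages stand in for Swin-T, and every name (TinyEncoder, MultiLevelFuser, ModulatedBlock, SMITSketch), dimension, and the global-pooling fusion choice is hypothetical.

```python
import torch
import torch.nn as nn

# Hedged sketch of the SMIT idea from the abstract. A real implementation
# would use Swin-T as the encoder and the StyleSwin generator; here both are
# replaced with small stand-ins so the example runs self-contained.

class TinyEncoder(nn.Module):
    """Stand-in for Swin-T: three downsampling stages, each emitting a feature map."""
    def __init__(self, ch=32):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, ch, 4, 2, 1), nn.GELU()),           # 1/2 resolution
            nn.Sequential(nn.Conv2d(ch, 2 * ch, 4, 2, 1), nn.GELU()),      # 1/4 resolution
            nn.Sequential(nn.Conv2d(2 * ch, 4 * ch, 4, 2, 1), nn.GELU()),  # 1/8 resolution
        ])

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # multi-resolution features, fine to coarse

class MultiLevelFuser(nn.Module):
    """Pools each feature level and projects it to a per-level style vector."""
    def __init__(self, in_chs, style_dim=128):
        super().__init__()
        self.proj = nn.ModuleList([nn.Linear(c, style_dim) for c in in_chs])

    def forward(self, feats):
        styles = []
        for f, proj in zip(feats, self.proj):
            pooled = f.mean(dim=(2, 3))  # global average pool: (B, C)
            styles.append(proj(pooled))  # per-level style: (B, style_dim)
        return styles

class ModulatedBlock(nn.Module):
    """Upsampling decoder block whose activations are modulated by a style vector."""
    def __init__(self, in_ch, out_ch, style_dim=128):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, 1, 1)
        self.to_scale = nn.Linear(style_dim, out_ch)
        self.to_shift = nn.Linear(style_dim, out_ch)
        self.act = nn.GELU()

    def forward(self, x, s):
        x = nn.functional.interpolate(x, scale_factor=2, mode="nearest")
        x = self.conv(x)
        scale = self.to_scale(s).unsqueeze(-1).unsqueeze(-1)
        shift = self.to_shift(s).unsqueeze(-1).unsqueeze(-1)
        return self.act(x * (1 + scale) + shift)  # style-conditioned affine modulation

class SMITSketch(nn.Module):
    def __init__(self, ch=32, style_dim=128):
        super().__init__()
        self.encoder = TinyEncoder(ch)
        self.fuser = MultiLevelFuser([ch, 2 * ch, 4 * ch], style_dim)
        self.blocks = nn.ModuleList([
            ModulatedBlock(4 * ch, 2 * ch, style_dim),  # 1/8 -> 1/4
            ModulatedBlock(2 * ch, ch, style_dim),      # 1/4 -> 1/2
            ModulatedBlock(ch, ch, style_dim),          # 1/2 -> full
        ])
        self.to_rgb = nn.Conv2d(ch, 3, 3, 1, 1)

    def forward(self, x):
        feats = self.encoder(x)    # [1/2, 1/4, 1/8] resolution features
        styles = self.fuser(feats) # one style vector per level
        h = feats[-1]              # decode from the coarsest feature map
        # Inject each level's style at the decoding scale it came from.
        for block, s in zip(self.blocks, reversed(styles)):
            h = block(h, s)
        return torch.tanh(self.to_rgb(h))

if __name__ == "__main__":
    out = SMITSketch()(torch.randn(1, 3, 64, 64))
    print(out.shape)  # torch.Size([1, 3, 64, 64])
```

The design point the sketch illustrates is that coarse levels carry global style and layout while fine levels carry texture, so injecting a separate style vector per resolution gives the decoder level-wise control over the translation.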
KSP Keywords
Content integrity, Contextual information, Convolutional neural network (CNN), Feature fusion, High-quality image, Multi-level feature, Multi-resolution, Public datasets, deep learning (DL), image generation, neural network (NN)