ETRI-Knowledge Sharing Platform

Generating Cartoon Scene Images with Latent Diffusion Models
Authors
SangEun Lee, Wonseok Chae, Hyeon-Jin Kim
Issue Date
2023-07
Citation
International Conference on Multimedia Information Technology and Applications (MITA) 2023, pp.11-14
Language
English
Type
Conference Paper
Abstract
Recent text-to-image models have shown superior performance in generating realistic images. In this paper, we propose a method of generating cartoon scene images by fine-tuning a text-to-image diffusion model. To enable the model to generate a cartoon scene in which a specific style of cartoon character appears, we fine-tuned the model using DreamBooth so that the model learns the cartoon characters. At the inference stage, we focused on the image-to-image method of translating a given reference image into a new image under the guidance of a text prompt. Moreover, we additionally adopted ControlNet and the latent couple technique at the inference stage: ControlNet enables the model to generate a cartoon character adopting the same pose as the human in the reference image, while the latent couple technique allows users to designate the position of each character in the image. The results demonstrate that users can generate not only cartoon scene images of a character in various contexts, including different facial expressions and poses, but also cartoon scene images of multiple characters.
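The latent couple technique mentioned in the abstract places each character in a user-designated region by blending per-prompt noise predictions with spatial masks at every denoising step. The sketch below is a minimal illustration of that blending step only, using numpy; the function name, mask layout, and per-pixel normalization are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def blend_noise_predictions(noise_preds, masks):
    """Blend per-prompt noise predictions by spatial region masks.

    noise_preds: list of (C, H, W) arrays, one per sub-prompt/character.
    masks:       list of (H, W) weight maps marking each character's region.
    Weights are normalized per pixel so overlapping regions are averaged.
    """
    weights = np.stack(masks).astype(float)               # (N, H, W)
    weights /= np.clip(weights.sum(axis=0), 1e-8, None)   # per-pixel normalize
    preds = np.stack(noise_preds)                         # (N, C, H, W)
    return (weights[:, None] * preds).sum(axis=0)         # (C, H, W)

# Two characters on an 8x8 latent grid: one assigned to the left
# half, the other to the right half.
h = w = 8
left = np.zeros((h, w)); left[:, : w // 2] = 1.0
right = np.zeros((h, w)); right[:, w // 2 :] = 1.0
pred_a = np.full((4, h, w), 1.0)    # dummy noise prediction, character A
pred_b = np.full((4, h, w), -1.0)   # dummy noise prediction, character B
blended = blend_noise_predictions([pred_a, pred_b], [left, right])
# Left-half pixels follow character A's prediction, right-half follow B's.
```

In an actual pipeline, this blend would replace the single noise prediction inside each sampler step, with each sub-prompt run through the UNet separately before blending.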
KSP Keywords
Cartoon Character, Cartoon Scene, Diffusion Model, Facial expression, Fine-tuning, Image diffusion, Image method, Reference Image, Scene images, image models, superior performance