ETRI-Knowledge Sharing Platform

Conference Paper Extraction of Character Areas from Digital Camera Based Color Document Images and OCR System
Cited 3 times in Scopus
Y.K. Chung, S.Y. Chi, K.S. Bae, K.K. Kim, D. Jang, K.C. Kim, Y.W. Choi
Issue Date
Optical Information Systems III (SPIE 5908), v.5908, pp.1-12
When document images are captured with digital cameras, several imaging problems must be solved before characters can be extracted reliably. Color values are sensitive to variations in illumination intensity. A simple color document image can be converted to a monochrome image by a traditional method and then binarized, but this approach is not stable under varying illumination because of that color sensitivity. The conversion also works poorly when the colors are narrowly distributed. Furthermore, when a document contains more than two colors, it is not easy to determine which color belongs to the characters and which belong to the background. This paper discusses a method for extracting characters from a color document image using a color processing algorithm based on the characteristics of color features. Variations in intensity and the color distribution are used to classify character areas and background areas. A document image is segmented into several color groups, and similar color groups are merged. In the final step, only two color groups remain, one for the characters and one for the background. The character areas extracted from the document images are then fed into an optical character recognition (OCR) system. This method solves a color problem inherited from traditional scanner-based OCR systems. This paper also describes the OCR system used to convert the characters of a color document image. Our method works on real-world color document images taken with cellular phones and digital cameras.
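The segment-then-merge idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes that pixels are plain RGB tuples, that the initial color groups come from coarse RGB binning, that "similar" means the smallest Euclidean distance between group mean colors, and that the smaller of the two final groups is the text. All function names and the `step` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def initial_groups(pixels, step=64):
    """Segment pixels into color groups by coarse RGB bin (assumed scheme)."""
    bins = defaultdict(list)
    for i, (r, g, b) in enumerate(pixels):
        bins[(r // step, g // step, b // step)].append(i)
    return [list(indices) for indices in bins.values()]


def mean_color(pixels, group):
    """Average RGB color of the pixels whose indices are in `group`."""
    n = len(group)
    return tuple(sum(pixels[i][c] for i in group) / n for c in range(3))


def color_distance(a, b):
    """Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def merge_to_two(pixels, groups):
    """Repeatedly merge the two most similar groups until two remain."""
    groups = [list(g) for g in groups]
    while len(groups) > 2:
        means = [mean_color(pixels, g) for g in groups]
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda p: color_distance(means[p[0]], means[p[1]]),
        )
        groups[i].extend(groups[j])  # i < j, so deleting j keeps i valid
        del groups[j]
    return groups


def character_mask(pixels, step=64):
    """Return True for pixels assigned to the (assumed) character group."""
    groups = merge_to_two(pixels, initial_groups(pixels, step))
    mask = [False] * len(pixels)
    if len(groups) < 2:
        return mask  # a single color group: nothing to separate
    # Assumption: text covers fewer pixels than the background.
    for i in min(groups, key=len):
        mask[i] = True
    return mask


# Background in two illumination levels, text in two shades of blue.
pixels = (
    [(250, 240, 200)] * 6 + [(200, 190, 150)] * 4
    + [(20, 20, 120)] * 3 + [(40, 30, 140)] * 2
)
mask = character_mask(pixels)
```

In this toy input the two background shades fall into different initial groups, as happens under uneven illumination, and merging by mean-color similarity reunites them before the text group is chosen. A real system would also use pixel positions and the intensity variation mentioned in the abstract, which this sketch leaves out.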
KSP Keywords
Binarization algorithm, Color features, Color values, Digital camera, Document images, Extraction method, Illumination intensity, OCR System, Optical character recognition, Real-world, Recognition System