• Title/Summary/Keyword: Postal Envelope Images

Search Result 3, Processing Time 0.017 seconds

Automatic Generation of Training Character Samples for OCR Systems

  • Le, Ha;Kim, Soo-Hyung;Na, In-Seop;Do, Yen;Park, Sang-Cheol;Jeong, Sun-Hwa
    • International Journal of Contents
    • /
    • v.8 no.3
    • /
    • pp.83-93
    • /
    • 2012
  • In this paper, we propose a novel method that automatically generates real character images to familiarize existing OCR systems with new fonts. At first, we generate synthetic character images using a simple degradation model. The synthetic data is used to train an OCR engine, and the trained OCR is used to recognize and label real character images that are segmented from ideal document images. Since the OCR engine is unable to recognize accurately all real character images, a substring matching method is employed to fix wrongly labeled characters by comparing two strings; one is the string grouped by recognized characters in an ideal document image, and the other is the ordered string of characters which we are considering to train and recognize. Based on our method, we build a system that automatically generates 2350 most common Korean and 117 alphanumeric characters from new fonts. The ideal document images used in the system are postal envelope images with characters printed in ascending order of their codes. The proposed system achieved a labeling accuracy of 99%. Therefore, we believe that our system is effective in facilitating the generation of numerous character samples to enhance the recognition rate of existing OCR systems for fonts that have never been trained.

An Implementation Method of the Character Recognizer for the Sorting Rate Improvement of an Automatic Postal Envelope Sorting Machine (우편물 자동구분기의 구분율 향상을 위한 문자인식기의 구현 방법)

  • Lim, Kil-Taek;Jeong, Seon-Hwa;Jang, Seung-Ick;Kim, Ho-Yon
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.12 no.4
    • /
    • pp.15-24
    • /
    • 2007
  • The recognition of postal address images is indispensable for the automatic sorting of postal envelopes. The process of the address image recognition is composed of three steps-address image preprocessing, character recognition, address interpretation. The extracted character images from the preprocessing step are forwarded to the character recognition step, in which multiple candidate characters with reliability scores are obtained for each character image extracted. aracters with reliability scores are obtained for each character image extracted. Utilizing those character candidates with scores, we obtain the final valid address for the input envelope image through the address interpretation step. The envelope sorting rate depends on the performance of all three steps, among which character recognition step could be said to be very important. The good character recognizer would be the one which could produce valid candidates with very reliable scores to help the address interpretation step go easy. In this paper, we propose the method of generating character candidates with reliable recognition scores. We utilize the existing MLP(multilayered perceptrons) neural network of the address recognition system in the current automatic postal envelope sorters, as the classifier for the each image from the preprocessing step. The MLP is well known to be one of the best classifiers in terms of processing speed and recognition rate. The false alarm problem, however, might be occurred in recognition results, which made the address interpretation hard. To make address interpretation easy and improve the envelope sorting rate, we propose promising methods to reestimate the recognition score (confidence) of the existing MLP classifier: the generation method of the statistical recognition properties of the classifier and the method of the combination of the MLP and the subspace classifier which roles as a reestimator of the confidence. To confirm the superiority of the proposed method, we have used the character images of the real postal envelopes from the sorters in the post office. The experimental results show that the proposed method produces high reliability in terms of error and rejection for individual characters and non-characters.

  • PDF

Postal Envelope Image Recognition System for Postal Automation (서장 우편물 자동처리를 위한 우편영상 인식 시스템)

  • Kim, Ho-Yon;Lim, Kil-Taek;Kim, Doo-Sik;Nam, Yun-Seok
    • The KIPS Transactions:PartB
    • /
    • v.10B no.4
    • /
    • pp.429-442
    • /
    • 2003
  • In this paper, we describe an address image recognition system for automatic processing of standard- size letter mail. The inputs to the system are gray-level mail piece images and the outputs are delivery point codes with which a delivery sequence of carrier can be generated. The system includes five main modules; destination address block location, text line separation, character segmentation, character recognition and finally address interpretation. The destination address block is extracted on the basis of experimental knowledge and the line separation and character segmentation is done through the analysis of connected components and vortical runs. For recognizing characters, we developed MLP-based recognizers and dynamical programming technique for interpretation. Since each module has been implemented in an independent way, the system has a benefit that the optimization of each module is relatively easy. We have done the experiment with live mail piece images directly sampled from mail sorting machine in Yuseong post office. The experimental results prove the feasibility of our system.