Design and Implementation of OCR Correction Model for Numeric Digits based on a Context Sensitive and Multiple Streams

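Design and Implementation of a Digit-Correcting OCR Model Based on Restricted Context Awareness and Multiple Streams (translated from the Korean title)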

  • Hyunkyung Shin (신현경), Department of Mathematics and Information, Kyungwon University
  • Received : 2010.09.10
  • Accepted : 2010.10.25
  • Published : 2011.02.28

Abstract

In an automated business document processing system that maintains financial data, errors in query-based retrieval of numbers are critical to the overall performance and usability of the system. Automatic spelling correction methods have emerged and have played an important role in the development of information retrieval systems. However, the scope of these methods has been limited to symbols, such as alphabetic letter strings, that can be stored as trainable templates or in a custom dictionary. Numbers, by contrast, are sequences of digits that cannot be stored in a dictionary; they are pure Markov sequences. In this paper, we propose a new OCR model for correcting numbers that uses multiple streams and context-based correction on top of a probabilistic information retrieval framework. We implemented the proposed error correction model as a sub-module and integrated it into an existing automated invoice document processing system. We also present comparative test results indicating that our model significantly improves the overall precision of the system.

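In an automated business document image processing system for managing financial data, errors that occur while retrieving numeric information are serious enough to determine the usability and performance of the system. Methods for automatic spelling correction have been developed and have played an important role in building information retrieval systems, but such correction has been limited to symbols, such as alphabetic characters, that are amenable to machine learning and can be stored in dictionary form. Digit strings, by contrast, are nothing more than pure Markov sequences and cannot be stored in dictionary form for use in spelling correction. This paper proposes a new digit-correcting OCR model that applies restricted context awareness and multiple streams on top of a probabilistic information retrieval algorithm. The proposed model was implemented in an existing invoice document processing system, and comparative tests run to verify its effect showed a considerable improvement in performance.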
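
The abstract names two ingredients, multiple streams and context-based correction, without giving the algorithm. The following Python sketch illustrates one way such a combination could work; it is not the author's model. It assumes that each OCR stream reports per-position digit confidences, that streams are fused by summing those confidences, and that the context is an invoice-style constraint (the total must equal the sum of the line items, in integer units). The function names, the data layout, and the constraint are all hypothetical.

```python
from itertools import product

def fuse_streams(streams):
    """Sum per-position digit confidences across streams.

    Assumption: every stream is a list (one entry per digit position)
    of dicts mapping a candidate digit to a confidence in [0, 1],
    and all streams have the same length.
    """
    fused = []
    for pos in range(len(streams[0])):
        scores = {}
        for stream in streams:
            for digit, conf in stream[pos].items():
                scores[digit] = scores.get(digit, 0.0) + conf
        fused.append(scores)
    return fused

def ranked_candidates(fused, top_k=2):
    """List candidate digit strings built from the top_k digits per position, best first."""
    per_pos = [sorted(s.items(), key=lambda kv: -kv[1])[:top_k] for s in fused]
    candidates = []
    for combo in product(*per_pos):
        text = "".join(digit for digit, _ in combo)
        score = sum(conf for _, conf in combo)
        candidates.append((text, score))
    candidates.sort(key=lambda item: -item[1])
    return candidates

def correct_total(streams, line_items, top_k=2):
    """Pick the best-scoring candidate that satisfies the context constraint.

    Hypothetical context rule: the total equals the sum of the line items.
    Falls back to the best-scoring candidate if none satisfies the rule.
    """
    candidates = ranked_candidates(fuse_streams(streams), top_k)
    expected = sum(line_items)
    for text, _ in candidates:
        if int(text) == expected:
            return text
    return candidates[0][0]

# Two hypothetical streams reading the same three-digit total.
stream_a = [{"1": 0.9, "7": 0.1}, {"8": 0.5, "3": 0.45}, {"0": 0.9}]
stream_b = [{"7": 0.6, "1": 0.4}, {"3": 0.6, "8": 0.4}, {"0": 0.8, "6": 0.2}]
corrected = correct_total([stream_a, stream_b], line_items=[100, 80])
```

In this example the fused confidences alone favor "130", but the context constraint selects "180", since that is the only candidate matching the line items. A dictionary lookup could not make this choice, which is the gap the abstract points to for numeric fields.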
