Seal Detection in Scanned Documents

  • Yu, Kyeonah (Dept. of Computer Science, Duksung Women's University) ;
  • Kim, Kyung-Hye (Dept. of Computer Science, Duksung Women's University)
  • Received : 2013.09.27
  • Accepted : 2013.11.20
  • Published : 2013.12.31

Abstract

With the advent of the digital age, documents are often scanned for archiving or for transmission over a network. Text makes up the largest share of scanned documents, followed by seal images that identify the author of the document. As scanned documents have grown in importance, much research has addressed text recognition in them and commercial text recognition products have been developed, yet the information carried by seal images is still discarded. In this paper, we study how to extract the seal region from a color or black-and-white document image containing a seal and how to save the seal image. We propose a preprocessing step that removes all components of the scanned document except the candidate outlines of the seal imprint, and a method that selects the final region of interest from these candidates using the features of seal images. When the detected seal imprint overlaps with text, the most similar seal among those stored in a database is found through template matching and saved in its place. We verify the implemented system on various types of documents commonly produced in schools and analyze the results.

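The template matching step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors use, so normalized cross-correlation is an assumption here, as are the function names `ncc` and `best_match`, the toy 4x4 binary images, and the dictionary standing in for the seal database. Images are assumed to be already cropped and scaled to the same size.

```python
from math import sqrt


def ncc(a, b):
    """Normalized cross-correlation of two equal-sized grids (lists of rows).

    Returns a value in [-1, 1]; 1 means identical up to brightness/contrast.
    """
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys):
        raise ValueError("images must have the same size")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mean_x) ** 2 for x in xs) *
               sum((y - mean_y) ** 2 for y in ys))
    # A flat image has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0


def best_match(roi, database):
    """Return (name, score) of the stored seal most similar to the ROI."""
    scores = {name: ncc(roi, template) for name, template in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical stored seals (1 = ink, 0 = background).
RING = [[1, 1, 1, 1],
        [1, 0, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1]]
CROSS = [[0, 1, 1, 0],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [0, 1, 1, 0]]
DATABASE = {"ring": RING, "cross": CROSS}

# Detected region: the ring seal with one extra ink pixel from overlapping text.
ROI = [[1, 1, 1, 1],
       [1, 1, 0, 1],
       [1, 0, 0, 1],
       [1, 1, 1, 1]]
```

Calling `best_match(ROI, DATABASE)` selects `"ring"`, since the stray text pixel lowers the correlation with the clean ring only slightly while the cross correlates negatively. A production system would more likely rely on a library routine such as OpenCV's `cv2.matchTemplate`, and would need to handle rotation and scale, which this sketch ignores.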
