History Document Image Background Noise and Removal Methods

  • Ganchimeg, Ganbold
  • Received : 2015.02.14
  • Accepted : 2015.04.19
  • Published : 2015.12.30


Archive libraries commonly provide public access to collections of historical and ancient document images. Such images often require specialized processing to remove background noise and improve legibility. Document images may be contaminated with noise during transmission, scanning, or conversion to digital form. Noise can be categorized by its features, and similar patterns can then be located in a document image to select an appropriate removal method. In this paper, we propose a hybrid binarization approach that combines global and local thresholding to improve the quality of old documents. We also review the types of noise that may appear in scanned document images and discuss several noise removal methods.
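
The abstract does not say how the global and local thresholds are combined, so the following is only a minimal sketch of one possible hybrid scheme, not the authors' method. It assumes an Otsu-style global threshold, a Sauvola-style local threshold, and a simple rule: pixels far from the global threshold are decided globally, while pixels within an uncertainty band around it are decided locally. The function names and the parameters `window`, `k`, `r`, and `band` are illustrative assumptions.

```python
import numpy as np


def otsu_threshold(gray):
    """Global threshold that maximizes between-class variance (Otsu-style)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of the dark class
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean gray level
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0
    sigma_b[valid] = (mu[-1] * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def local_mean_std(gray, window):
    """Mean and standard deviation in a window x window neighborhood (odd window)."""
    pad = window // 2
    g = np.pad(gray.astype(np.float64), pad, mode="reflect")
    # Integral images with a leading zero row and column.
    s1 = np.pad(g, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    s2 = np.pad(g * g, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = gray.shape

    def box(s):
        # Sum over each window using four integral-image lookups.
        return (s[window:window + h, window:window + w] - s[:h, window:window + w]
                - s[window:window + h, :w] + s[:h, :w])

    n = window * window
    mean = box(s1) / n
    var = np.maximum(box(s2) / n - mean ** 2, 0.0)
    return mean, np.sqrt(var)


def hybrid_binarize(gray, window=15, k=0.2, r=128.0, band=30):
    """Return a boolean mask (True = ink) for an 8-bit grayscale image.

    Assumed combination rule: use the global threshold where the pixel is
    clearly dark or clearly bright, and the local threshold elsewhere.
    """
    t_global = otsu_threshold(gray)
    mean, std = local_mean_std(gray, window)
    t_local = mean * (1.0 + k * (std / r - 1.0))   # Sauvola-style local threshold
    g = gray.astype(np.float64)
    uncertain = np.abs(g - t_global) <= band
    return np.where(uncertain, g <= t_local, g <= t_global)


if __name__ == "__main__":
    # Synthetic page: bright background with one thin dark stroke.
    page = np.full((40, 40), 200, dtype=np.uint8)
    page[18:21, 5:35] = 40
    ink = hybrid_binarize(page)
    print("ink pixels:", int(ink.sum()))
```

The band width controls how much of the decision is delegated to the local threshold. A wider band favors robustness to uneven backgrounds, while a narrower band keeps more of the global decision; Sauvola-style thresholds are also known to hollow out large filled dark regions.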


Binarization; History Document Noise; Noise Removal Algorithms

