A Study on the Automatic Digital DB of Boring Log Using AI

AI를 활용한 시추주상도 자동 디지털 DB화 방안에 관한 연구

  • Park, Ka-Hyun (Geotechnical Engineering Research Department, Korea Institute of Civil Engineering and Building Technology) ;
  • Han, Jin-Tae (Geotechnical Engineering Research Department, Korea Institute of Civil Engineering and Building Technology) ;
  • Yoon, Youngno (Bright Data LLC.)
  • 박가현 (한국건설기술연구원 지반연구본부) ;
  • 한진태 (한국건설기술연구원 지반연구본부) ;
  • 윤영노 (브라이트데이터)
  • Received : 2021.11.02
  • Accepted : 2021.11.08
  • Published : 2021.11.30

Abstract

Building the database (DB) in the current geotechnical information DB system consumes considerable human and time resources. It also frequently suffers from accuracy problems, because data entry relies on a person reading each PDF and typing in the results by hand. This study therefore proposes a method for automatically building a digital DB of boring logs using artificial intelligence (AI). To handle the various boring log formats without exception, the logs were first classified by form with the deep learning model ResNet 34, covering six forms in total. The classifier reached an overall accuracy of 99.7% and an ROC_AUC score of 1.0, separating the forms with very high performance. The text in each PDF was then read automatically using a robotic process automation (RPA) technique fine-tuned for each form. The general information, strata information, and standard penetration test (SPT) information were extracted, separated, and saved in the same format as that provided by the geotechnical information DB system. Finally, the information in the boring logs was converted into a DB automatically at a speed of 140 pages per second.
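The two reported metrics measure different things, which is why an accuracy just under 100% can coexist with an ROC_AUC score of 1.0. The following is a minimal sketch of both metrics on invented toy data, assuming a macro-averaged one-vs-rest AUC; the abstract does not state how the authors computed their score, and the three classes, labels, and probabilities here are hypothetical.

```python
from typing import List, Sequence

def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(y_true: Sequence[int], probs: List[List[float]], n_classes: int) -> float:
    """Average the one-vs-rest AUC over all classes (assumed averaging scheme)."""
    aucs = []
    for c in range(n_classes):
        labels = [1 if t == c else 0 for t in y_true]
        scores = [row[c] for row in probs]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / n_classes

# Hypothetical toy data: 3 form classes, 2 pages each (the study used 6 forms).
y_true = [0, 0, 1, 1, 2, 2]
probs = [
    [0.8, 0.1, 0.1],
    [0.4, 0.5, 0.1],  # true class 0, but the highest score is for class 1
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.2, 0.6],
]
# Predicted class = index of the largest probability in each row.
y_pred = [max(range(len(row)), key=row.__getitem__) for row in probs]

print("accuracy:", accuracy(y_true, y_pred))
print("macro OvR ROC AUC:", macro_ovr_auc(y_true, probs, n_classes=3))
```

In this toy case one page is misclassified, so accuracy falls below 1, yet within every class column each positive page still scores higher than every negative page, so the ranking-based AUC stays at 1.0.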

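The geotechnical information managed in the Geotechnical Information DB System (국토지반정보 포털시스템) is built by people who read PDF files and type in the values one by one, which consumes considerable human and time resources and frequently causes accuracy problems. Among the various kinds of geotechnical information, this study targets the boring log, the most common and widely used kind in Korea, and proposes a method for automatically building a digital database from it using artificial intelligence (AI). First, so that data could be entered into the database automatically for every boring log form without exception, the forms were classified with the deep learning model ResNet 34. Image classification over six boring log forms gave an overall accuracy of 99.7% and an ROC_AUC score of 1.0, separating the forms with very high accuracy. Next, a robotic process automation technique fine-tuned for each form read the text in the PDFs automatically; the general information, SPT information, and strata information in each boring log were then extracted, separated, and built into a DB in the same format as that provided by the existing Geotechnical Information DB System. As a result, the information in the boring logs could be converted into a DB automatically at a speed of 140 pages per second.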
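The form-specific extraction step can be pictured as a dispatch from the classified form label to a parser written for that layout, with every parser filling the same record structure. The sketch below is illustrative only: the `BoringRecord` fields, the labels `form_a` and `form_b`, and both text layouts are invented for the example, and it assumes the page text has already been read from the PDF and the form label has already been produced by the classifier. The actual forms and DB schema are not given in the abstract.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class BoringRecord:
    """Hypothetical unified record; not the real DB schema."""
    general: Dict[str, str] = field(default_factory=dict)
    strata: List[Tuple[float, float, str]] = field(default_factory=list)  # (top m, bottom m, soil)
    spt: List[Tuple[float, int]] = field(default_factory=list)            # (depth m, N value)

def parse_form_a(text: str) -> BoringRecord:
    """Invented layout A: 'key: value' header lines plus STRATA/SPT rows."""
    rec = BoringRecord()
    for line in text.splitlines():
        line = line.strip()
        if m := re.fullmatch(r"STRATA\s+([\d.]+)\s+([\d.]+)\s+(.+)", line):
            rec.strata.append((float(m[1]), float(m[2]), m[3]))
        elif m := re.fullmatch(r"SPT\s+([\d.]+)\s+(\d+)", line):
            rec.spt.append((float(m[1]), int(m[2])))
        elif m := re.fullmatch(r"([^:]+):\s*(.+)", line):
            rec.general[m[1].strip()] = m[2]
    return rec

def parse_form_b(text: str) -> BoringRecord:
    """Invented layout B: semicolon-delimited rows tagged H, L, or N."""
    rec = BoringRecord()
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(";")]
        if parts[0] == "H" and len(parts) == 3:
            rec.general[parts[1]] = parts[2]
        elif parts[0] == "L" and len(parts) == 4:
            rec.strata.append((float(parts[1]), float(parts[2]), parts[3]))
        elif parts[0] == "N" and len(parts) == 3:
            rec.spt.append((float(parts[1]), int(parts[2])))
    return rec

# One parser per form label; the label is assumed to come from the classifier.
PARSERS: Dict[str, Callable[[str], BoringRecord]] = {
    "form_a": parse_form_a,
    "form_b": parse_form_b,
}

def to_record(form_label: str, page_text: str) -> BoringRecord:
    """Route a page to the parser registered for its form."""
    if form_label not in PARSERS:
        raise ValueError(f"no parser registered for form {form_label!r}")
    return PARSERS[form_label](page_text)

page_a = "Hole No: BH-1\nSTRATA 0.0 1.5 fill\nSPT 1.5 12"
page_b = "H;Hole No;BH-2\nL;0.0;2.0;clay\nN;2.0;15"
rec_a = to_record("form_a", page_a)
rec_b = to_record("form_b", page_b)
print(rec_a)
print(rec_b)
```

Keeping the output type identical across parsers means everything downstream of extraction, such as saving to the DB format, does not need to know which form a page came from.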

Acknowledgement

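This research was supported by the Korea Institute of Civil Engineering and Building Technology through its major project "Advancement of Tech-lead Type Liquefaction Hazard Map Construction Technology (4/4)" (Tech-lead형 액상화 위험지도 구축기술 고도화 연구(4/4)), and the authors express their deep gratitude for this support.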
