A Classification of Medical and Advertising Blogs Using Machine Learning

  • Lee, Gi-Sung (이기성; Division of Computer & Game, Howon University)
  • Lee, Jong-Chan (이종찬; Department of Computer Information Engineering, Kunsan National University)
  • Received : 2018.09.29
  • Accepted : 2018.11.02
  • Published : 2018.11.30

Abstract

As more health consumers pursue a better quality of life, the O2O medical marketing market is growing: consumers save time and money by using medical information from blogs scattered across the web to choose reliable medical facilities and receive high-quality medical services. However, the unstructured text data proliferating on the Internet, mobile platforms, and social networks reflects, directly or indirectly, the authors' interests, preferences, and expectations in addition to professional medical knowledge, so the credibility of this medical information is difficult to guarantee. In this study, we propose a blog assessment system that provides users with higher-quality medical information by classifying medical information blogs into medical blogs and advertising blogs using big data and an MLP. Using the proposed big data and machine learning technology, we collect and analyze a large number of domestic medical information blogs on the Internet and then develop a personalized, disease-specific health information recommendation system. We expect that this will allow users to maintain their health by continuously monitoring their health problems and taking the most appropriate measures.
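The abstract does not say how blog text is turned into numeric input for the classifier, but the figure list names TF-IDF (Fig. 5). As a rough illustration only, the following sketch computes TF-IDF weights with a common textbook formula (term frequency times the log of inverse document frequency). The function name `tf_idf`, the toy posts, and the pre-tokenized English input are all assumptions made for this example and are not taken from the paper; real Korean blog text would first need morphological analysis.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not confirmed by the paper):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy posts, already split into tokens.
posts = [
    ["knee", "pain", "treatment", "exercise"],
    ["knee", "clinic", "discount", "event"],
    ["knee", "surgery", "recovery", "exercise"],
]
weights = tf_idf(posts)
```

In this toy data, "knee" appears in every post and so receives a weight of zero, while a word such as "discount" that appears in only one post receives a comparatively high weight. Vectors of this kind could then serve as input features for a classifier such as the MLP the abstract mentions.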

Fig. 1. Structure of big data system

Fig. 2. Storage structure for blog data

Fig. 3. Structure of data in MongoDB

Fig. 4. Data processing structure

Fig. 5. Structure of TF-IDF

Table 1. Collection/processing/analysis of a blog

Table 2. Parameters for determining blog type

Table 3. Parameter in the MLP

Table 4. Medical dictionary by disease

Table 5. Test items and performance objectives
