Classification of Web Search Engines and the Necessity of a Hybrid Search Engine

  • Paik, Juryon (백주련) (Department of Digital Information & Statistics, Pyeongtaek University)
  • Received : 2018.03.29
  • Accepted : 2018.04.25
  • Published : 2018.04.30

Abstract

In 2017, Google was reported to hold more than 90% of the search-engine market on both desktop and mobile. Most people therefore assume that Google searches the entire Web. According to many studies of web data, however, Google searches less than 10% of it. The larger part of the Web is called the Deep Web, and it is indexed by specialized search engines that differ from Google in that each focuses on a specific segment of interest. These engines build their own deep-web databases and run dedicated algorithms to provide accurate, domain-expert search results. No search engine currently indexes the entire Web. The best available approach is to use several search engines together so that searches are as broad and efficient as possible. This paper defines such an engine as a Hybrid Search Engine, describes its characteristics and its differences from conventional search engines, and presents a framework for a hybrid search engine.

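As of 2017, Google is the search engine with a dominant share of more than 90% on both desktop and mobile, and most people probably assume that the area Google searches is the whole Web. According to web research, however, only about 10% of all web data is searchable by Google. Most of the Web is called the Deep Web and is searched by engines of a different kind from Google. These engines build their own deep-web databases and then apply specialized algorithms to deliver highly accurate, expert search results. None of the search engines in use today searches the entire Web. The best way to search broadly, validly, accurately, and quickly is to apply general-purpose search engines such as Google together with deep-web search engines and derive results from both. This paper names such an engine a hybrid search engine, examines its characteristics and how it differs from existing search engines, and then presents an outline framework.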
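
The idea of querying a general-purpose engine and deep-web engines together and combining their results can be illustrated with a minimal sketch. It is not the framework proposed in the paper: the merging step uses reciprocal rank fusion, a common rank-merging technique chosen here as an assumption, and the engine functions, the example URLs, and the constant k = 60 are hypothetical placeholders rather than real search APIs.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical engine type: takes a query and returns result URLs, best first.
Engine = Callable[[str], List[str]]


def general_engine(query: str) -> List[str]:
    """Stand-in for a general-purpose (surface-web) engine; returns fixed data."""
    return ["https://example.org/a", "https://example.org/b", "https://example.org/c"]


def medical_engine(query: str) -> List[str]:
    """Stand-in for a domain-specific (deep-web) engine; returns fixed data."""
    return ["https://example.org/d", "https://example.org/a"]


def hybrid_search(query: str, engines: Dict[str, Engine], k: int = 60) -> List[str]:
    """Send the query to every engine and merge the ranked lists.

    Merging uses reciprocal rank fusion (an assumed choice): each engine adds
    1 / (k + rank) to the score of every URL it returns.
    """
    scores: Dict[str, float] = defaultdict(float)
    for engine in engines.values():
        for rank, url in enumerate(engine(query), start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by URL so the order is stable.
    return sorted(scores, key=lambda url: (-scores[url], url))


ENGINES: Dict[str, Engine] = {"general": general_engine, "medical": medical_engine}
merged = hybrid_search("diabetes treatment", ENGINES)
```

In this sketch a result returned by both engines accumulates score from each list, so it ranks above results that only one engine found, while results unique to the deep-web engine still appear in the merged list.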
