Scalable RDFS Reasoning using Logic Programming Approach in a Single Machine

(Korean title, translated: A Logic-Programming-Based Technique for Large-Scale RDFS Reasoning in a Single-Machine Environment)

  • Received : 2014.03.11
  • Accepted : 2014.08.22
  • Published : 2014.10.15

Abstract

As the web of data produces ever larger RDFS datasets, building reasoning engines that scale to large numbers of triples has become essential. Many studies have used expensive distributed frameworks, such as Hadoop, to reason over large RDFS triple sets. In many cases, however, only millions of triples need to be handled. In such cases it is not necessary to deploy an expensive distributed system, because a logic-programming-based reasoner on a single machine can achieve reasoning performance similar to that of a distributed reasoner using Hadoop. In this paper, we propose a scalable RDFS reasoner that uses logic programming methods on a single machine, and we compare our empirical results with those of distributed systems. We show that our single-machine reasoner performs comparably to an expensive distributed reasoner for up to 200 million RDFS triples. In addition, we designed a metadata structure that decomposes the ontology triples into separate sectors. Instead of loading all the triples into a single model, we select an appropriate subset of the triples for each ontology reasoning rule. Because unification makes it easy to handle conjunctive queries for RDFS schema reasoning, we designed and implemented the RDFS axioms using logic programming unification and an efficient conjunctive query handling mechanism. The throughput of our approach reached 166K triples/sec on LUBM1500, which contains 200 million triples. This is comparable to WebPIE, a distributed reasoner using Hadoop and MapReduce, which achieves 185K triples/sec. These results indicate that a distributed system is unnecessary for up to 200 million triples, since a logic-programming-based reasoner on a single machine performs comparably to an expensive distributed reasoner built on the Hadoop framework.

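With the growing use of data expressed in RDFS on the Semantic Web, demand for reasoning over large-scale data is increasing. Many researchers use expensive distributed frameworks such as Hadoop to perform large-scale ontology reasoning. However, for RDFS triple sets of moderate size, logic programming on a single machine can achieve reasoning performance similar to that of a distributed environment, without resorting to an expensive distributed system. In this paper, we propose a large-scale RDFS reasoning technique that applies a logic programming approach on a single machine, and we show experimentally that, for RDFS reasoning over roughly 200 million triples, it performs similarly to a distributed system built on multiple machines. For efficient reasoning, we propose a metadata structure that divides the ontology model into fine-grained parts, together with an indexing scheme for large triple sets. To this end, rather than loading all triples into a single model, we select an appropriate set of triples for each ontology reasoning rule. We also propose an ontology reasoning method that exploits the triple matching, search, and conjunctive query processing provided by the unification algorithm of logic programming. In experiments on LUBM1500 (200 million triples), a reasoning engine using the proposed technique achieved a throughput of 166K triples/sec, which is similar to the 185K triples/sec of WebPIE running MapReduce on 8 nodes (8 cores per node). We therefore confirmed that, for up to about 200 million triples, our single-machine approach performs comparably to a distributed system without requiring one.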
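The idea of selecting only the relevant triples for each rule, and evaluating a rule as a conjunctive query, can be illustrated with a small sketch. This is not the authors' implementation (which uses logic programming and unification); it is a Python approximation that assumes, for illustration only, that the "sectors" are per-predicate partitions and that the rule of interest is the standard RDFS entailment rule rdfs9 (type inheritance through `rdfs:subClassOf`). The function names and the `ex:` identifiers are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def partition_by_predicate(triples):
    """Group (s, p, o) triples into per-predicate sectors (assumed layout)."""
    sectors = defaultdict(set)
    for s, p, o in triples:
        sectors[p].add((s, o))
    return sectors


def subclass_closure(sub_pairs):
    """Transitive closure of rdfs:subClassOf (rule rdfs11), as class -> superclasses."""
    supers = defaultdict(set)
    for c, d in sub_pairs:
        supers[c].add(d)
    changed = True
    while changed:
        changed = False
        for c in list(supers):
            reachable = set()
            for d in supers[c]:
                reachable |= supers.get(d, set())
            if not reachable <= supers[c]:
                supers[c] |= reachable
                changed = True
    return supers


def apply_rdfs9(sectors):
    """rdfs9: (x rdf:type C) and (C rdfs:subClassOf D) imply (x rdf:type D).

    Only the rdf:type and rdfs:subClassOf sectors are read; all other
    triples are never touched by this rule.
    """
    supers = subclass_closure(sectors.get(SUBCLASS, set()))
    existing = sectors.get(RDF_TYPE, set())
    inferred = set()
    for x, c in existing:
        for d in supers.get(c, ()):
            if (x, d) not in existing:
                inferred.add((x, RDF_TYPE, d))
    return inferred


# Hypothetical toy data; the ex: names are not taken from LUBM.
triples = [
    ("ex:GraduateStudent", SUBCLASS, "ex:Student"),
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:GraduateStudent"),
    ("ex:alice", "ex:memberOf", "ex:dept0"),
]
sectors = partition_by_predicate(triples)
inferred = apply_rdfs9(sectors)
```

Here the two-pattern rule body plays the role of a conjunctive query: the shared class variable joins a type triple with a subclass triple. In a logic programming system that join is expressed directly through unification, whereas this sketch spells it out as a dictionary lookup.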

Acknowledgement

Supported by: Ministry of Science, ICT and Future Planning (미래창조과학부)
