Provenance Compression Scheme Considering RDF Graph Patterns

RDF 그래프 패턴을 고려한 프로버넌스 압축 기법

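  • 복경수 (Department of Information and Communication Engineering, Chungbuk National University)
  • 한지은 (Department of Information and Communication Engineering, Chungbuk National University)
  • 노연우 (Department of Information and Communication Engineering, Chungbuk National University)
  • 육미선 (Department of Information and Communication Engineering, Chungbuk National University)
  • 임종태 (Department of Information and Communication Engineering, Chungbuk National University)
  • 이석희 (Department of New Media Contents, Dong-Ah Institute of Media and Arts)
  • 유재수 (Department of Information and Communication Engineering, Chungbuk National University)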
  • Received : 2015.11.13
  • Accepted : 2015.12.07
  • Published : 2016.02.28


Provenance is metadata that represents the history or lineage of data in collaborative storage environments. Because provenance accumulates over time, it can grow to several tens of times the size of the original data, so schemes for efficiently compressing large amounts of provenance are required. In this paper, we propose a provenance compression scheme that considers RDF graph patterns. The proposed scheme represents provenance based on the standard PROV model and encodes provenance as numeric data through text encoding. It then compresses the provenance and RDF data using the graph patterns. Unlike conventional provenance compression techniques, the proposed scheme compresses provenance by taking RDF documents on the semantic web into account. To show the superiority of the proposed scheme, we compare it with an existing scheme in terms of compression ratio and processing time.
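The abstract does not give the details of the encoding or of the pattern-based compression, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that "text encoding" can be approximated by a dictionary that maps each distinct term to an integer ID, and that a "graph pattern" can be approximated by a shared (predicate, object) pair that is stored once together with the list of subjects using it. The function names and the sample PROV triples are hypothetical.

```python
from collections import defaultdict


def encode_triples(triples):
    """Dictionary-encode string triples into integer triples.

    Each distinct term receives the next free integer ID the first
    time it is seen (an assumed stand-in for the paper's text encoding).
    """
    dictionary = {}
    encoded = []
    for triple in triples:
        ids = tuple(dictionary.setdefault(term, len(dictionary)) for term in triple)
        encoded.append(ids)
    return dictionary, encoded


def group_by_pattern(encoded):
    """Store each repeated (predicate, object) pair once.

    The pair acts as a simplified 'graph pattern'; the subjects that
    share it are kept in a list instead of repeating the pair.
    """
    groups = defaultdict(list)
    for s, p, o in encoded:
        groups[(p, o)].append(s)
    return dict(groups)


def decode(dictionary, groups):
    """Rebuild the original string triples, showing the sketch is lossless."""
    terms = {term_id: term for term, term_id in dictionary.items()}
    return {
        (terms[s], terms[p], terms[o])
        for (p, o), subjects in groups.items()
        for s in subjects
    }


# Hypothetical provenance triples using PROV vocabulary terms.
triples = [
    ("doc:v1", "rdf:type", "prov:Entity"),
    ("doc:v2", "rdf:type", "prov:Entity"),
    ("doc:v1", "prov:wasAttributedTo", "agent:alice"),
    ("doc:v2", "prov:wasAttributedTo", "agent:alice"),
    ("doc:v2", "prov:wasDerivedFrom", "doc:v1"),
]

dictionary, encoded = encode_triples(triples)
groups = group_by_pattern(encoded)
restored = decode(dictionary, groups)
```

In this toy input, five triples collapse into three stored patterns, because both document versions share the same type and attribution. A real scheme would also need a compact on-disk layout for the dictionary and the subject lists, which this sketch leaves out.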


RDF; Provenance; Compression; PROV Model


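Supported by : Institute for Information & Communications Technology Promotion (IITP), National Research Foundation of Korea (NRF), Korea Institute of Energy Technology Evaluation and Planning (KETEP)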

