SPARQL Query Processing System over Scalable Triple Data using SparkSQL Framework
  • Journal title : Journal of KIISE
  • Volume 43, Issue 4,  2016, pp.450-459
  • Publisher : Korean Institute of Information Scientists and Engineers
  • DOI : 10.5626/JOK.2016.43.4.450
 Title & Authors
SPARQL Query Processing System over Scalable Triple Data using SparkSQL Framework
Jeon, MyungJoong; Hong, JinYoung; Park, YoungTack
 Abstract
RDFS data grows in scale every year, so SPARQL processing must change to keep queries fast. SPARQL query processing has therefore been studied on scalable distributed processing frameworks. Current studies indicate that query engines built on Hadoop (MapReduce) are not suitable for real-time processing because of their repetitive tasks. It is also difficult to build a query engine on top of an in-memory distributed query engine, because the low-level distributed structure has to be taken into account. In this paper, we propose a method for constructing a query engine that speeds up query processing over massive triple data. The query engine processes SPARQL queries using SparkSQL, an in-memory distributed query processing framework. SparkSQL is a high-level distributed query engine that supports existing SQL statements. To process a SPARQL query, the system first generates an algebra tree using Jena, then translates it into a Spark algebra tree for use in the Spark system, and finally generates a SparkSQL query from that tree. Furthermore, we propose a triple property table design based on DataFrame for more efficient query processing in the Spark system. Finally, we verify the validity of the approach through a comparative evaluation against an existing query engine built on a distributed processing framework.
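The translation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it skips Jena and the intermediate algebra trees entirely and handles only a basic graph pattern, given as a list of triple patterns. The function name `bgp_to_sql`, the table name `triples`, and the column names `s`, `p`, `o` are all hypothetical choices made for this example. Each triple pattern becomes one alias of the triple table, constants become equality filters, and a variable that appears more than once becomes a join condition.

```python
from typing import Dict, List, Tuple

# A triple pattern is (subject, predicate, object); variables start with "?".
TriplePattern = Tuple[str, str, str]


def is_var(term: str) -> bool:
    return term.startswith("?")


def bgp_to_sql(patterns: List[TriplePattern], table: str = "triples") -> str:
    """Translate a basic graph pattern into one SQL query with self-joins.

    Assumes a single table with columns s, p, o (hypothetical layout).
    """
    columns = ("s", "p", "o")
    first_seen: Dict[str, str] = {}  # variable -> column where it first occurs
    conditions: List[str] = []

    for i, pattern in enumerate(patterns):
        alias = f"t{i}"
        for col, term in zip(columns, pattern):
            ref = f"{alias}.{col}"
            if is_var(term):
                if term in first_seen:
                    # Repeated variable: join with its first occurrence.
                    conditions.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
            else:
                # Constant term: filter on it, escaping single quotes.
                escaped = term.replace("'", "''")
                conditions.append(f"{ref} = '{escaped}'")

    select = ", ".join(f"{ref} AS {var[1:]}" for var, ref in first_seen.items()) or "*"
    from_clause = ", ".join(f"{table} t{i}" for i in range(len(patterns)))
    sql = f"SELECT {select} FROM {from_clause}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


if __name__ == "__main__":
    # Students and the courses they take (illustrative vocabulary only).
    print(bgp_to_sql([
        ("?x", "rdf:type", "ub:Student"),
        ("?x", "ub:takesCourse", "?c"),
    ]))
```

In PySpark, a string like this could be passed to `spark.sql(...)` once a DataFrame of triples has been registered as a view with the assumed name. A property table layout, as proposed in the abstract, would replace some of these self-joins with column lookups on a single row per subject.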
 Keywords
in-memory based distributed query engine; RDFS; SPARQL; Spark; SparkSQL; Sempala
 Language
Korean