Efficient Query Retrieval from Social Data in Neo4j using LIndex

  • Received : 2017.04.08
  • Accepted : 2018.01.13
  • Published : 2018.05.31

Abstract

Unstructured and semi-structured big data in social networks poses new challenges for query retrieval. Meeting them requires measures that improve retrieval time, such as indexing. Because of the huge volume of stored data, efficient indexing algorithms are needed to speed up query processing. However, conventional algorithms fail to index the large amount of frequently arriving information in real time and fall short of providing a scalable indexing service. In this paper, a new algorithm, LIndex, a heuristic built on Lucene, is implemented on the Neo4j HA architecture that holds the social network big data. LIndex is a flexible, simplified adaptive indexing scheme that uses decomposed shortest paths around term neighbors as its basic indexing unit. The new index proves effective at pruning the query space of the Neo4j graph database and scalable in index construction and deployment. Graph query processing is optimized beyond traditional Lucene, moving from its time-based approach to a more efficient path-based method in LIndex. The algorithm significantly reduces query fetch time without compromising the quality of the results. Experiments are conducted to confirm the efficiency of the proposed query retrieval in the Neo4j graph NoSQL database.
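
The abstract does not spell out how LIndex is constructed, so the following is only a rough Python sketch of the general idea of indexing shortest paths around term neighbors and using them to prune the query space. It is not the authors' algorithm and uses neither Lucene nor Neo4j. The toy graph, the function names (`bfs_distances`, `build_index`, `candidates`) and the `radius` parameter are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def bfs_distances(graph, source, radius):
    """Return hop distances from `source` to every node within `radius` hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the neighborhood radius
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def build_index(graph, terms, radius=2):
    """Map each term to {node: shortest hop distance to a node carrying that term}."""
    index = defaultdict(dict)
    for node in graph:
        for other, d in bfs_distances(graph, node, radius).items():
            for term in terms.get(other, ()):
                best = index[term].get(node)
                if best is None or d < best:
                    index[term][node] = d
    return index


def candidates(index, query_terms, radius=2):
    """Return nodes that have every query term within `radius` hops."""
    sets = [
        {n for n, d in index.get(t, {}).items() if d <= radius}
        for t in query_terms
    ]
    return set.intersection(*sets) if sets else set()


# Hypothetical toy social graph (undirected adjacency lists) and node terms.
graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
}
terms = {"alice": ["neo4j"], "dave": ["lucene"]}

index = build_index(graph, terms, radius=2)
print(sorted(candidates(index, ["neo4j", "lucene"])))
```

In this sketch, only nodes that have every query term within the chosen radius survive as candidates, so a more expensive graph match would need to run on that reduced set alone. A real deployment would keep such an index inside the database rather than in Python dictionaries.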
