Optimized Entity Attribute Value Model: A Search Efficient Representation of High Dimensional and Sparse Data

  • Paul, Razan (Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology)
  • Latiful Hoque, Abu Sayed Md. (Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology)
  • Received : 2011.06.30
  • Accepted : 2011.07.06
  • Published : 2011.09.30

Abstract

Entity Attribute Value (EAV) is a widely used model for representing high dimensional and sparse data, but it is not search efficient for knowledge extraction. In this paper, we propose a search efficient data model, Optimized Entity Attribute Value (OEAV), for the physical representation of high dimensional and sparse data as an alternative to EAV. We implemented both the EAV and OEAV models in a data warehousing environment and ran a range of relational and warehouse queries against each. The experimental results show that OEAV is dramatically more search efficient and occupies less storage space than EAV.
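
To make the EAV layout concrete, the following is a minimal sketch of how sparse records can be stored as (entity, attribute, value) triples, and of why an attribute search has to scan every triple. The attribute names, sample values, and the helper functions `to_eav` and `entities_where` are hypothetical and chosen only for illustration. The sketch does not attempt to reproduce OEAV, whose physical layout is not described in this abstract.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[int, str, float]

# Hypothetical sparse records: most attributes are missing (None).
wide_rows: Dict[int, Dict[str, Optional[float]]] = {
    1: {"glucose": 5.4, "hemoglobin": None, "sodium": None},
    2: {"glucose": None, "hemoglobin": 13.1, "sodium": 140.0},
    3: {"glucose": None, "hemoglobin": None, "sodium": None},
}


def to_eav(rows: Dict[int, Dict[str, Optional[float]]]) -> List[Triple]:
    """Keep only the populated cells as (entity, attribute, value) triples."""
    triples: List[Triple] = []
    for entity, attrs in rows.items():
        for attribute, value in attrs.items():
            if value is not None:
                triples.append((entity, attribute, value))
    return triples


def entities_where(triples: List[Triple], attribute: str, minimum: float) -> List[int]:
    """Naive search: every triple is inspected, whatever its attribute."""
    return [e for e, a, v in triples if a == attribute and v >= minimum]


eav = to_eav(wide_rows)
matches = entities_where(eav, "sodium", 100.0)
```

In this toy data only three of the nine cells are populated, so the EAV form stores three triples instead of nine cells. The search, however, compares the attribute name of every stored triple, which is the kind of cost the abstract refers to when it calls EAV not search efficient.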
