Performance Evaluation of SSD-Index Maintenance Schemes in IR Applications

  • Jin, Du-Seok (Department of Information Technology Research, Korea Institute of Science and Technology Information)
  • Jung, Hoe-Kyung (Department of Computer Engineering, Paichai University)
  • Received : 2010.07.01
  • Accepted : 2010.07.01
  • Published : 2010.08.31

Abstract

With the advent of flash-memory-based storage devices (solid-state drives, SSDs), the computer industry has shown considerable interest in using them for many different types of applications. Among these, maintaining a dynamic index structure for large text collections has been a primary issue in information retrieval (IR) applications. Previous studies have shown three approaches to be effective: in-place update, merge-based update, and a combination of both. These strategies were studied on traditional storage devices (hard disk drives, HDDs), which constrain how the contiguity of dynamic data is maintained. On an SSD, by contrast, contiguity is not a constraint because of the device's low access latency. Nevertheless, although SSDs offer low access latency and improved I/O throughput, they are still not well suited to traditional dynamic index structures because of their poor random-write throughput in practical systems. Therefore, based on an experimental performance evaluation of various index maintenance schemes on an SSD, we propose an efficient index structure for SSDs that significantly improves index maintenance speed without degrading query performance.
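
To make the difference between the in-place and merge-based approaches concrete, the following is a minimal sketch of a toy in-memory inverted index. It is not the index structure proposed in the paper. The function names (`build_buffer`, `flush_in_place`, `flush_re_merge`) and the cost counters are assumptions introduced only for illustration: the number of separate list updates stands in for random writes, and the number of postings rewritten stands in for sequential write volume.

```python
from collections import defaultdict


def build_buffer(docs):
    """Invert a batch of (doc_id, text) pairs into an in-memory buffer."""
    buffer = defaultdict(list)
    for doc_id, text in docs:
        # Record each distinct term of the document once.
        for term in sorted(set(text.lower().split())):
            buffer[term].append(doc_id)
    return buffer


def flush_in_place(index, buffer):
    """In-place update: append new postings to each affected term's list.

    Returns the number of separate list updates, used here as a
    hypothetical stand-in for random writes on the device.
    """
    random_writes = 0
    for term, postings in buffer.items():
        index.setdefault(term, []).extend(postings)
        random_writes += 1
    return random_writes


def flush_re_merge(index, buffer):
    """Merge-based update: build a whole new index in one sequential pass.

    Returns the new index and the number of postings rewritten, used here
    as a hypothetical stand-in for sequential write volume.
    """
    merged = {}
    postings_written = 0
    for term in sorted(set(index) | set(buffer)):
        merged[term] = index.get(term, []) + buffer.get(term, [])
        postings_written += len(merged[term])
    return merged, postings_written


if __name__ == "__main__":
    old = {"flash": [1], "index": [1, 2], "disk": [2]}
    batch = build_buffer([(3, "flash index"), (4, "index merge")])

    in_place = {term: list(postings) for term, postings in old.items()}
    updates = flush_in_place(in_place, batch)
    merged, rewritten = flush_re_merge(old, batch)

    print(updates, rewritten, in_place == merged)
```

Both strategies produce the same logical index. In this toy model they differ only in write pattern: the in-place flush touches one list per updated term, while the merge-based flush rewrites every posting, including those of terms that did not change. A combined scheme would choose between the two per term, for example by list length, though the selection rule is beyond what the abstract specifies.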
