A Genetic Algorithm for Minimizing Query Processing Time in Distributed Database Design: Total Time Versus Response Time

  • 송석규 (Department of Hotel Management, Youngsan University)
  • Published: 2009.06.30

Abstract

Minimizing query execution time is an important objective in distributed database design. Total time minimization is the objective for online transaction processing (OLTP), whereas response time minimization is the objective for decision support queries. We formulate the sub-query allocation problem using analytical models and solve it with a genetic algorithm (GA). We show that query execution plans that minimize total time are inefficient from a response time perspective, and vice versa. The procedure is tested in simulation experiments on queries of up to 20 joins. Comparison with exhaustive enumeration indicates that the GA produced optimal solutions in all cases, in much less time.

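Minimizing query execution time is one of the most important objectives in distributed database design. Total time minimization is the objective of online transaction processing systems, whereas response time minimization is the objective of decision support query systems. In this paper, we develop an analytical model that decomposes a query into sub-queries and allocates them to the best database sites so as to minimize query execution time, and we adopt a genetic algorithm as the solution method. We show that a query execution plan built for total time minimization is ill-suited to response time minimization, and vice versa. We test the approach in simulation experiments on queries containing up to 20 joins and, by comparing the genetic algorithm against full exhaustive enumeration, show that the genetic algorithm produced the optimal result in every case.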
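
The contrast between the two objectives can be illustrated with a minimal sketch. It is not the paper's analytical model: the cost table `COST`, the site count, and the GA parameters are hypothetical, and response time is simplified to the load of the busiest site, assuming independent sub-queries that run in parallel across sites with no transfer cost or join precedence.

```python
import random
from itertools import product

# Hypothetical processing times: COST[q][s] = time of sub-query q at site s.
COST = [
    [4, 5, 6],
    [3, 4, 5],
    [5, 6, 7],
    [2, 3, 4],
]
N_SITES = 3


def total_time(plan):
    """Sum of all work, regardless of where it runs."""
    return sum(COST[q][s] for q, s in enumerate(plan))


def response_time(plan):
    """Elapsed time if sites work in parallel: the busiest site's load."""
    load = [0] * N_SITES
    for q, s in enumerate(plan):
        load[s] += COST[q][s]
    return max(load)


def exhaustive(fitness):
    """Enumerate every allocation; feasible only for small instances."""
    return min(product(range(N_SITES), repeat=len(COST)), key=fitness)


def genetic_search(fitness, pop_size=30, generations=60, p_mut=0.1, seed=0):
    """Simple GA: tournament selection, one-point crossover, mutation."""
    rng = random.Random(seed)
    n = len(COST)
    pop = [tuple(rng.randrange(N_SITES) for _ in range(n))
           for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            # Each parent is the better of two randomly drawn plans.
            a = min(rng.sample(pop, 2), key=fitness)
            b = min(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n)
            child = list(a[:cut] + b[cut:])
            for i in range(n):
                if rng.random() < p_mut:
                    child[i] = rng.randrange(N_SITES)  # reassign to a random site
            children.append(tuple(child))
        pop = children
        best = min(pop + [best], key=fitness)  # keep the best plan seen so far
    return best


if __name__ == "__main__":
    for name, objective in [("total", total_time), ("response", response_time)]:
        plan = genetic_search(objective)
        print(name, plan, total_time(plan), response_time(plan))
```

On this hypothetical table, exhaustive enumeration places every sub-query on site 0 when minimizing total time (total 14, response 14), whereas the best achievable response time is 6, reached only by spreading work across sites at a higher total time. Comparing `genetic_search` with `exhaustive` on the same objective mirrors, in miniature, the kind of check the abstract describes.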
