DOI QR코드

DOI QR Code

MLPPI Wizard: An Automated Multi-level Partitioning Tool on Analytical Workloads

  • Received : 2017.07.19
  • Accepted : 2017.11.08
  • Published : 2018.04.30

Abstract

An important technique used by database administrators (DBAs) is to improve performance in decision-support workloads associated with a Star schema is multi-level partitioning. Queries will then benefit from performance improvements via partition elimination, due to constraints on queries expressed on the dimension tables. As the task of multi-level partitioning can be overwhelming for a DBA we are proposing a wizard that facilitates the task by calculating a partitioning scheme for a particular workload. The system resides completely on a client and interacts with the costing estimation subsystem of the query optimizer via an API over the network, thereby eliminating any need to make changes to the optimizer. In addition, since only cost estimates are needed the wizard overhead is very low. By using a greedy algorithm for search space enumeration over the query predicates in the workload the wizard is efficient with worst-case polynomial complexity. The technology proposed can be applied to any clustering or partitioning scheme in any database management system that provides an interface to the query optimizer. Applied to the Teradata database the technology provides recommendations that outperform a human expert's solution as measured by the total execution time of the workload. We also demonstrate the scalability of our approach when the fact table (and workload) size increases.

Keywords

References

  1. Sinclair, P., "Using PPIs to Improve Performance," August 31, 2015.
  2. Klindt, J., "Single-level and Multilevel Partitioned Primary Indexes," July 15, 2015. .
  3. O'Neil, P., O'Neil, B., and Chen, X., "The Star Schema Benchmark (SSB)," July 15, 2017.
  4. Agrawal, S., Narasayya, V., and Yang, B., "Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design," in Proc. of SIGMOD, pp. 359-370, 2004.
  5. Lightstone, S.S. and Bhattacharjee, B., "Automated Design of Multi-dimensional Clustering Tables for Relational Databases," in Proc. of VLDB, pp. 1170-1181, 2004.
  6. Nehme, R. and Bruno, N., "Automated Partitioning Design in Parallel Database Systems," in Proc. of SIGMOD, pp. 1137-1148, 2011.
  7. Suh, Y.K., Ghazal, A., Crolotte, A., and Kostamaa, P., "A New Tool for Multi-level Partitioning in Teradata," in Proc. of CIKM, pp. 2214-2218, 2012.
  8. Chaudhuri, S. and Narasayya, V., "AutoAdmin 'What-If' Index Analysis Utility," SIGMOD Record, vol. 27, no. 2, pp. 367-378, 1998. https://doi.org/10.1145/276305.276337
  9. Finkelstein, S., Schkolnick, M., and Tiberio, P., "Physical Database Design for Relational Databases," ACM Trans. on Databas. Syst. vol. 13, no. 1, pp. 91-128, 1988. https://doi.org/10.1145/42201.42205
  10. Labio,W., Quass, D., and Adelberg, B., "Physical Database Design for Data Warehouses," in Proc. of ICDE, pp. 277-288, 1997.
  11. Rozen, S. and Shasha, D., "A Framework for Automating Physical Database Design," in Proc. of VLDB, pp. 401-411, 1991.
  12. Oracle Corp., "Partitioning Advisor," February 2, 2017. .
  13. Oracle Corp., "SQL Access Advisor," March 2012. .
  14. Rao, J., Zhang, C., Megiddo, N., and Lohman, G., "Automating Physical Database Design in a Parallel Database," in Proc. of SIGMOD, pp. 558-569, 2002.
  15. Agrawal, S., Chaudhuri, S., Kollar, L., Marathe, A., Narasayya, and V., Syamala, M., "Database Tuning Advisor for Microsoft SQL Server 2005," in Proc. of SIGMOD, pp. 930-932, 2005.
  16. Agrawal, S., Chaudhuri, S., and Narasayya, V.R., "Automated Selection of Materialized Views and Indexes in SQL Databases," in Proc. of VLDB, pp. 496-505, 2000.
  17. Dash, D., Polyzotis, N., and Ailamaki, A., "CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads," in Proc. of PVLDB, vol. 4, no. 6, pp. 362-372, 2011.
  18. Kimura, H., Narasayya, and V., Syamala, M., "Compression Aware Physical Database Design," in Proc. of PVLDB, vol. 4, no. 10, pp. 657-668, 2011.
  19. Microsoft Corp. "Microsoft SQL Server 2000: Index Tuning Wizard SQL Server 2000," (viewed on March 1, 2012).
  20. Valentin, G., Zuliani, M., Zilio, D.C., Lohman, G., and Skelley, A., "DB2 Advisor: An Optimizer Smart Enough to Recommend Its Own Indexes," in ICDE, pp. 101-110, 2000.
  21. Zilio, D.C., Zuzarte, C., Lohman, G.M., Pirahesh, H., Gryz, J., Alton, E., Liang, D., and Valentin, G., "Recommending Materialized Views and Indexes with the IBM DB2 Design Advisor," in Proc. of the Int'l Conf. on Autonomic Computing, pp. 180-188, 2004.
  22. Zilio, D.C., Rao, J., Lightstone, S., Lohman, G., Storm, A., Garcia-Arellano, C., and Fadden, S. "DB2 Design Advisor: Integrated Automatic Physical Database," in Proc. of VLDB, pp. 1087-1097, 2004.
  23. Tatarowicz, A.L., Curino, C., Jones, E.P.C., and Madden, S., "Lookup Tables: Fine-grained Partitioning for Distributed Databases," in Proc. of ICDE, pp. 102-113, 2012.
  24. Idreos, S., Kersten, M.L., and Manegold, S., "Database Cracking," in Proc. of CIDR, pp. 68-78, 2007.
  25. Wang, J., Hang, S., Liu, J., Chen, W., and Hou, G., "Multi-level Scheduling Algorithm Based on Storm," in Proc. of TIIS, vol. 10, no.3, 2016.
  26. Yuan, Y., Chen, W., Han, G., Jia, G., "OLAP4R: A Top-K Recommendation System for OLAP Sessions," in Proc. of TIIS, vol. 11, no.6, 2017.
  27. Zahra Pooranian, Mohammad Shojafar, Bahman Javadi, Ajith Abraham, "Using Imperialist Competition algorithm for Independent Task Scheduling in Grid Computing," Journal of Intelligent and Fuzzy Systems, vol. 27, no. 1, pp. 187-199, 2014.
  28. Zahra Pooranian, Mohammad Shojafar, Reza Tavoli, Mukesh Singhal, Ajith Abraham, "A Hybrid Metaheuristic Algorithm for Job Scheduling on Computational Grids," Informatica (Slovenia) vol. 37, no. 2, pp.157-164, 2013.
  29. TPC-H, September 29, 2017.