Predicting Stock Liquidity by Using Ensemble Data Mining Methods
Bae, Eun Chan; Lee, Kun Chang;
Stock liquidity, which indicates how readily stocks can be converted to cash in the market, has received considerable attention in the finance literature from both academics and practitioners. There are several reasons. First, stock liquidity is known to affect asset pricing significantly. Second, macroeconomic announcements influence liquidity in the stock market. Stock liquidity therefore affects the decisions of both investors and managers. Although the finance literature on stock liquidity is extensive, no studies have attempted to investigate stock liquidity as a decision-making problem. Most stock liquidity studies have taken a limited view, examining questions such as how much liquidity influences stock prices or which variables are significantly associated with it. This paper posits that stock liquidity can be treated as a serious decision-making problem and handled with data mining techniques that estimate its future level with statistical validity. To this end, we collected financial data on manufacturing companies listed on the KRX (Korea Exchange) for the period 2010 to 2013. We started the dataset in 2010 to avoid the aftershocks of the 2008 financial crisis. We used the Fn-GuidPro system to gather a total of 5,700 financial data records. The stock liquidity measure was computed following the procedure proposed by Amihud (2002), which is known to be the best metric for capturing the relationship with daily returns. We applied five data mining techniques (classifiers): Bayesian network, support vector machine (SVM), decision tree, neural network, and ensemble method. The Bayesian networks include GBN (General Bayesian Network), NBN (Naive BN), and TAN (Tree Augmented NBN). The decision trees use CART and C4.5. Regression results were used as the performance benchmark.
The ensemble method takes two forms: integration of two classifiers and integration of three classifiers. In both forms, the classifiers are integrated by voting. Among the single classifiers, CART performed best at 48.2%, compared with 37.18% for regression. Among the ensemble methods, integrating TAN, CART, and SVM performed best at 49.25%. In an additional analysis by individual industry, relatively stable industries such as electronic appliances, wholesale and retailing, wood, and leather, bags, and shoes showed better performance, exceeding 50%.
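As a rough illustration of integrating classifiers by voting, the sketch below combines the predicted labels of three classifiers by plurality vote. It is not the authors' implementation: the function name `majority_vote`, the class labels `"low"`, `"mid"`, and `"high"`, the sample predictions, and the tie-breaking rule (prefer the earliest-listed classifier) are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label predictions from several classifiers by plurality vote.

    predictions: list of equal-length label lists, one per classifier.
    Ties are broken in favor of the earliest-listed classifier whose label
    is among the tied labels (an assumed rule, not taken from the paper).
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    combined = []
    for votes in zip(*predictions):  # one tuple of votes per sample
        counts = Counter(votes)
        top = max(counts.values())
        # First vote (in classifier order) that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


# Hypothetical predictions for four firms from three classifiers.
tan_pred = ["high", "low", "mid", "low"]
cart_pred = ["high", "mid", "mid", "high"]
svm_pred = ["low", "mid", "high", "mid"]

ensemble_pred = majority_vote([tan_pred, cart_pred, svm_pred])
print(ensemble_pred)
```

With three voters and more than two classes, a three-way tie is possible (the fourth firm above), so whatever tie rule is chosen can affect the reported ensemble performance.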
Stock liquidity; Data mining; Ensemble methods; Decision making
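The liquidity measure of Amihud (2002) named in the abstract is commonly written as the average, over trading days, of the absolute daily return divided by that day's trading volume in currency units. The sketch below computes it for one stock under that definition. The function name, the sample figures, and the choice to skip zero-volume days are assumptions for illustration. The abstract does not say how the authors scaled the measure or converted it into class labels, so neither step is shown.

```python
def amihud_illiquidity(returns, dollar_volumes):
    """Average of |daily return| / daily dollar volume over the given days.

    Higher values mean a larger price move per unit of volume traded,
    i.e. lower liquidity. Days with zero volume are skipped (an assumed
    convention, since the ratio is undefined on those days).
    """
    if len(returns) != len(dollar_volumes):
        raise ValueError("returns and dollar_volumes must align day by day")
    ratios = [abs(r) / v for r, v in zip(returns, dollar_volumes) if v > 0]
    if not ratios:
        raise ValueError("no trading days with positive volume")
    return sum(ratios) / len(ratios)


# Hypothetical three-day sample: returns as fractions, volumes in currency units.
daily_returns = [0.01, -0.02, 0.0]
daily_volumes = [1_000_000, 2_000_000, 500_000]

illiq = amihud_illiquidity(daily_returns, daily_volumes)
print(f"{illiq:.3e}")
```

Because the raw ratio is very small, applied work often multiplies it by a constant before use. Any such scaling is a presentation choice and does not change the ranking of stocks.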