Forecasting Symbolic Candle Chart-Valued Time Series

  • Park, Heewon
  • Sakaori, Fumitake
  • Received : 2014.07.06
  • Accepted : 2014.11.08
  • Published : 2014.11.30


This study introduces a new type of symbolic data, the candle chart-valued time series. We aggregate four stock index values (open, close, highest and lowest) into a single data point to summarize a huge amount of data. In other words, we treat a candle chart, which is constructed from the open, close, highest and lowest stock indices, as a type of symbolic data observed over a long period. The proposed candle chart-valued time series effectively summarizes and visualizes a huge data set of stock indices, making changes in the indices easy to understand. We also propose novel approaches to modeling candle chart-valued time series based on a combination of two midpoints and two half ranges: those between the highest and the lowest indices, and those between the open and the close indices. Furthermore, we propose three types of sums of squares for estimating the candle chart-valued time series model. The proposed methods take into account information not only from ordinary data but also from the interval of each object, and thus can perform effectively in time series modeling (e.g., forecasting a future stock index). To evaluate the proposed methods, we present a real data analysis of the stock market indices of five major Asian countries. The results show that the proposed approaches outperform classical data analysis in forecasting future stock indices.
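As a rough illustration of the midpoint and half-range representation mentioned above, the following Python sketch converts one candle into two midpoints and two half ranges and back again. The names `Candle`, `CandleFeatures`, `to_features` and `to_candle` are hypothetical and do not come from the paper. The signed open-close half range, which keeps the direction of the candle body, is an assumption made here so that the mapping is invertible; the paper's exact definition may differ.

```python
from typing import NamedTuple


class Candle(NamedTuple):
    """One candle: open, highest, lowest and close index values."""
    open_: float
    high: float
    low: float
    close: float


class CandleFeatures(NamedTuple):
    """Two midpoints and two half ranges describing a candle."""
    mid_hl: float   # midpoint between highest and lowest
    half_hl: float  # half range between highest and lowest
    mid_oc: float   # midpoint between open and close
    half_oc: float  # signed half range (assumption): positive if close > open


def to_features(c: Candle) -> CandleFeatures:
    """Map a candle to its midpoint / half-range representation."""
    return CandleFeatures(
        mid_hl=(c.high + c.low) / 2,
        half_hl=(c.high - c.low) / 2,
        mid_oc=(c.open_ + c.close) / 2,
        half_oc=(c.close - c.open_) / 2,
    )


def to_candle(f: CandleFeatures) -> Candle:
    """Invert to_features, e.g. to turn forecast features back into a candle."""
    return Candle(
        open_=f.mid_oc - f.half_oc,
        high=f.mid_hl + f.half_hl,
        low=f.mid_hl - f.half_hl,
        close=f.mid_oc + f.half_oc,
    )


example = Candle(open_=100.0, high=110.0, low=95.0, close=105.0)
features = to_features(example)
restored = to_candle(features)
```

Under this sketch, each of the four feature series could be modeled with an ordinary time series method and the forecasts mapped back with `to_candle`. How the paper combines the series and defines its three sums of squares is not specified in this abstract.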


Candle chart;Symbolic data analysis;interval-valued data;time series;stock market indices of major Asian countries

