Influence analysis of Internet buzz to corporate performance : Individual stock price prediction using sentiment analysis of online news

온라인 언급이 기업 성과에 미치는 영향 분석 : 뉴스 감성분석을 통한 기업별 주가 예측

  • Received : 2015.11.05
  • Accepted : 2015.12.06
  • Published : 2015.12.30

Abstract

With the development of Internet technology and the rapid growth of Internet data, many studies are being conducted on how to use and analyze such data for various purposes. In recent years in particular, a number of studies have applied text mining techniques to overcome the limitations of relying on structured data alone. Among them, many address sentiment analysis, which scores opinions according to the polarity (positive or negative) of the words or sentences in a document. As part of this line of work, this study attempts to predict rises and falls in companies' stock prices by performing sentiment analysis on online news about each company. A wide variety of company news is produced online by different economic agents, and it spreads quickly and is easily accessed on the Internet. Therefore, based on the inefficient market hypothesis, we can expect that news about an individual company can be used to predict fluctuations in that company's stock price if proper data analysis techniques are applied. However, because companies differ in their areas of business activity, machine-learning-based analysis of text data needs to take the characteristics of each company into account. In addition, because news containing positive or negative information about one company can affect other companies or industries in different ways, stock price prediction needs to be carried out for each company separately. This study therefore attempted to predict changes in the stock prices of individual companies by applying sentiment analysis to online news data. To this end, we selected top companies in the KOSPI 200 as the subjects of analysis, and collected and analyzed two years of online news for each company from Naver, a representative domestic search portal.
In addition, considering that the meaning of a word can differ from one economic subject to another, the study aims to improve performance by building a separate lexicon for each company and applying it in the analysis. The results show that prediction accuracy differs by company, averaging 56%. Comparing industry sectors, 'energy/chemical', 'consumer goods for living' and 'consumer discretionary' showed relatively higher prediction accuracy than other industries, while sectors such as 'information technology' and 'shipbuilding/transportation' showed lower accuracy. Because only five representative companies were collected for each industry, it is somewhat difficult to generalize, but the results confirm that prediction accuracy differs by industry sector. At the individual company level, companies such as 'Kangwon Land', 'KT & G' and 'SK Innovation' showed relatively higher prediction accuracy than other companies, while companies such as 'Young Poong', 'LG', 'Samsung Life Insurance', and 'Doosan' showed low prediction accuracy of less than 50%. In this paper, we predicted the stock prices of individual companies from online news information using a lexicon built in advance for each company, with the aim of improving stock price prediction performance. Building on this, future work may be able to raise prediction accuracy by addressing the problem of unnecessary words being added to the sentiment dictionary.
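
The per-company lexicon idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the lexicon is constructed or how scores are turned into a prediction, so everything below is an assumption made for illustration: the function names (`build_lexicon`, `predict_direction`, `accuracy`), the polarity formula (the share of up-day minus down-day occurrences), the `min_count` filter, and the English toy tokens standing in for tokenized Korean news.

```python
from collections import Counter

def build_lexicon(docs, labels, min_count=2):
    """Build a polarity lexicon for ONE company (hypothetical scheme).

    docs:   list of token lists, one per trading day
    labels: +1 if the stock rose that day, -1 if it fell
    Returns {word: polarity in [-1, 1]}.
    """
    up, down = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        # Count each word once per day so long articles do not dominate.
        (up if label > 0 else down).update(set(tokens))
    lexicon = {}
    for word in set(up) | set(down):
        total = up[word] + down[word]
        if total >= min_count:  # drop rare words, a crude noise filter
            lexicon[word] = (up[word] - down[word]) / total
    return lexicon

def predict_direction(tokens, lexicon):
    """Predict +1 (up) or -1 (down) from the summed word polarities."""
    score = sum(lexicon.get(t, 0.0) for t in tokens)
    return 1 if score >= 0 else -1

def accuracy(docs, labels, lexicon):
    """Fraction of days on which the predicted direction was correct."""
    hits = sum(predict_direction(d, lexicon) == y for d, y in zip(docs, labels))
    return hits / len(labels)

# Toy data (invented for illustration, not from the study).
train_docs = [["record", "profit", "growth"], ["lawsuit", "loss"],
              ["profit", "contract"], ["loss", "recall"]]
train_labels = [1, -1, 1, -1]
lexicon = build_lexicon(train_docs, train_labels)

test_docs = [["profit"], ["loss"], ["recall"]]
test_labels = [1, -1, -1]
test_accuracy = accuracy(test_docs, test_labels, lexicon)
```

In this sketch, words that do not reach `min_count` are excluded from the lexicon, which loosely mirrors the abstract's concern about unnecessary words entering the sentiment dictionary. A separate lexicon would be built from each company's own news history.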

Keywords

Stock Prediction;Sentiment Analysis;Predictive Analytics

Acknowledgement

Supported by: National Research Foundation of Korea (한국연구재단)