Text Categorization with Improved Deep Learning Methods

  • Wang, Xingfeng (Information Engineering College, Eastern Liaoning University)
  • Kim, Hee-Cheol (Department of Computer Engineering & Institute of Digital Anti-Aging Healthcare (IDA), Inje University)
  • Received: 2018.03.14
  • Accepted: 2018.06.14
  • Published: 2018.06.30

Abstract

Although deep learning methods based on convolutional neural networks (CNNs) and long short-term memory (LSTM) are widely used for text categorization, they still have certain shortcomings. CNNs require that the text retain some order and that the pooling lengths be identical, and they do not allow collateral analysis; LSTMs operate unidirectionally, and their inputs/outputs are very complex. To address these problems, we improved these traditional deep learning methods in the following ways: we created collateral CNNs that accept disorder and variable-length pooling, and we removed the input/output gates when creating bidirectional LSTMs. We evaluated the proposed methods on four benchmark datasets for topic and sentiment classification. The best results were obtained by combining LSTM regional embeddings with data convolution. Our method outperformed all previous methods (including deep learning methods) on both topic and sentiment classification.
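
The abstract mentions bidirectional LSTMs with the input/output gates removed and pooling that tolerates variable lengths. The following is a minimal sketch of one way such a cell could look, not the authors' implementation. The update rule (forget gate only, unscaled candidate, `tanh` output), the max pooling over time, and all names (`init_params`, `run_simplified_lstm`, `encode_document`) are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, U, b, x, h):
    """Compute W @ x + U @ h + b with plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(W[i], x))
        + sum(u * hi for u, hi in zip(U[i], h))
        + b[i]
        for i in range(len(b))
    ]


def init_params(input_dim, hidden_dim, rng):
    """Random weights for the forget gate (f) and the candidate update (u)."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "Wf": mat(hidden_dim, input_dim),
        "Uf": mat(hidden_dim, hidden_dim),
        "bf": [1.0] * hidden_dim,
        "Wu": mat(hidden_dim, input_dim),
        "Uu": mat(hidden_dim, hidden_dim),
        "bu": [0.0] * hidden_dim,
    }


def run_simplified_lstm(params, inputs):
    """Run the assumed forget-gate-only cell; return one hidden state per step."""
    hidden_dim = len(params["bf"])
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    states = []
    for x in inputs:
        f = [sigmoid(v) for v in affine(params["Wf"], params["Uf"], params["bf"], x, h)]
        u = [math.tanh(v) for v in affine(params["Wu"], params["Uu"], params["bu"], x, h)]
        # No input gate: the candidate u is added to the cell state unscaled.
        c = [fi * ci + ui for fi, ci, ui in zip(f, c, u)]
        # No output gate: the hidden state is just tanh of the cell state.
        h = [math.tanh(ci) for ci in c]
        states.append(h)
    return states


def encode_document(fwd_params, bwd_params, inputs):
    """Bidirectional pass, then max pooling over time to a fixed-size vector."""
    forward = run_simplified_lstm(fwd_params, inputs)
    # Run the second cell right-to-left, then realign its states with the input.
    backward = run_simplified_lstm(bwd_params, inputs[::-1])[::-1]
    per_step = [f + b for f, b in zip(forward, backward)]
    # Max over the time axis, so documents of any length map to the same size.
    return [max(column) for column in zip(*per_step)]


rng = random.Random(0)
fwd = init_params(input_dim=3, hidden_dim=4, rng=rng)
bwd = init_params(input_dim=3, hidden_dim=4, rng=rng)
short_doc = [[rng.random() for _ in range(3)] for _ in range(2)]
long_doc = [[rng.random() for _ in range(3)] for _ in range(7)]
print(len(encode_document(fwd, bwd, short_doc)), len(encode_document(fwd, bwd, long_doc)))
```

In this sketch, a two-token and a seven-token document both map to a vector whose length is twice the hidden size, because pooling takes the maximum over however many time steps there are. The toy word vectors are random placeholders, not real embeddings.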
