  • Journal title : Phonetics and Speech Sciences
  • Volume 7, Issue 4, 2015, pp. 3-8
  • Publisher : The Korean Society of Speech Sciences
  • DOI : 10.13064/KSSS.2015.7.4.003
 Title & Authors
Input Dimension Reduction based on Continuous Word Vector for Deep Neural Network Language Model
Kim, Kwang-Ho; Lee, Donghyun; Lim, Minkyu; Kim, Ji-Hwan
 Abstract
In this paper, we investigate an input dimension reduction method using continuous word vectors in a deep neural network language model. In the proposed method, continuous word vectors were generated with Google's Word2Vec from a large training corpus so as to satisfy the distributional hypothesis. The 1-of-N coded discrete word vectors were then replaced with their corresponding continuous word vectors. In our implementation, the input dimension was successfully reduced from 20,000 to 600 when a tri-gram language model was used with a vocabulary of 20,000 words. The total training time was reduced from 30 days to 14 days on the Wall Street Journal training corpus (corpus length: 37M words).
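Below is a minimal sketch of the input construction described above, assuming 300-dimensional continuous word vectors for each of the two history words of a tri-gram model (2 x 300 = 600 input units) and using gensim's Word2Vec as a stand-in for Google's Word2Vec tool; the names, parameters, and toy corpus are illustrative, not the authors' implementation.

    import numpy as np
    from gensim.models import Word2Vec  # assumed stand-in for Google's Word2Vec tool

    VECTOR_DIM = 300   # assumed word vector dimension: 2 history words x 300 = 600 inputs

    # Toy corpus; the paper trains on a large corpus (e.g., WSJ, 37M words).
    sentences = [["the", "stock", "market", "rose"],
                 ["the", "stock", "market", "fell"]]

    # Train continuous word vectors (distributional hypothesis: words occurring
    # in similar contexts receive similar vectors).  gensim >= 4.0 API assumed.
    w2v = Word2Vec(sentences, vector_size=VECTOR_DIM, window=5, min_count=1, sg=1)

    def trigram_input(w1, w2):
        """Build the DNN LM input for a tri-gram history (two previous words):
        concatenate two continuous word vectors (600 dimensions) instead of
        concatenating two sparse 20,000-dimensional 1-of-N vectors."""
        return np.concatenate([w2v.wv[w1], w2v.wv[w2]])

    x = trigram_input("the", "stock")
    print(x.shape)   # (600,)

The resulting 600-dimensional vector replaces the 1-of-N encoding at the DNN language model's input layer, which is what reduces the input dimension and, in the paper, the training time.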
 Keywords
deep neural network; language model; continuous word vector; input dimension reduction
 Language
Korean
 Cited by
1.
이규환, 정지오, 신대진, 정민화, 강경희, 장윤희 and 장경호 (2016). Emergency dispatching based on automatic speech recognition [음성인식 기반 응급상황관제], Phonetics and Speech Sciences (말소리와 음성과학), Vol. 8, No. 2, 31-39.
 References
1.
Bengio, Y., Ducharme, R., Vincent, P. and Jauvin, C. (2003). A neural probabilistic language model, Journal of Machine Learning Research, Vol. 3, 1137-1155.

2.
Bengio, Y. (2009). Learning deep architectures for AI, Foundations and Trends in Machine Learning, Vol. 2, No. 1, 1-127.

3.
Schwenk, H. & Gauvain, J. (2005). Training neural network language models on very large corpora, in Proc. Empirical Methods in Natural Language Processing, 201-208.

4.
Arisoy, E., Sainath, T., Kingsbury, B. and Ramabhadran, B. (2012). Deep neural network language models, in Proc. NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, 20-28.

5.
Turney, P. & Pantel, P. (2010). From frequency to meaning: vector space models of semantics, Journal of Artificial Intelligence Research, Vol. 37, No. 1, 141-188.

6.
Schütze, H. & Pedersen, J. (1995). Information retrieval based on word senses, in Proc. Symposium on Document Analysis and Information Retrieval, 161-175.

7.
Rubenstein, H. & Goodenough, J. (1965). Contextual correlates of synonymy, Communications of the ACM, Vol. 8, No. 10, 627-633.

8.
Bruni, E., Boleda, G., Baroni, M. and Tran, N. (2012). Distributional semantics in technicolor, in Proc. 50th Annual Meeting of the Association for Computational Linguistics, 136-145.

9.
Mikolov, T. (2013). Word2Vec, https://code.google.com/p/word2vec.

10.
Faruqui, M. & Dyer, C. (2014). Community evaluation and exchange of word vectors at wordvectors.org, in Proc. Association for Computational Linguistics, 1-6.

11.
Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G. and Ruppin, E. (2001). Placing search in context: the concept revisited, in Proc. The Tenth International World Wide Web Conference, 406-414.

12.
Luong, M., Socher, R. and Manning, C. (2013). Better word representations with recursive neural networks for morphology, in Proc. Computational Natural Language Learning, 1-10.