References
- F. Seide, G. Li, and D. Yu, Conversational speech transcription using context-dependent deep neural networks, Annu. Conf. Int. Speech Commun. Assoc., Florence, Italy, Aug. 27-31, 2011, pp. 437-440.
- G. Hinton et al., Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, IEEE Signal Process. Mag. 29 (2012), no. 6, 82-97. https://doi.org/10.1109/MSP.2012.2205597
- M. Gales, Maximum likelihood linear transformations for HMM-based speech recognition, Comput. Speech Lang. 12 (1998), no. 2, 75-98. https://doi.org/10.1006/csla.1998.0043
- F. Seide et al., Feature engineering in context-dependent deep neural networks for conversational speech transcription, IEEE Workshop Autom. Speech Recogn. Understanding, Waikoloa, HI, USA, Dec. 11-15, 2011, pp. 24-29.
- J. Stadermann and G. Rigoll, Two-stage speaker adaptation of hybrid tied-posterior acoustic models, Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Philadelphia, PA, USA, Mar. 23-25, 2005, pp. 977-980.
- D. Albesano et al., Adaptation of artificial neural networks avoiding catastrophic forgetting, IEEE Int. Joint Conf. Neural Network Proc., Vancouver, Canada, July 16-21, 2006, pp. 1554-1561.
- S. Xue et al., Fast adaptation of deep neural network based on discriminant codes for speech recognition, IEEE/ACM Trans. Audio, Speech, Language Process. 22 (2014), no. 12, 1713-1725. https://doi.org/10.1109/TASLP.2014.2346313
- T. Tan, Y. Qian, and K. Yu, Cluster adaptive training for deep neural network based acoustic model, IEEE/ACM Trans. Audio, Speech, Language Process. 24 (2016), no. 3, 459-468. https://doi.org/10.1109/TASLP.2015.2511922
- G. Saon et al., Speaker adaptation of neural network acoustic models using i-vectors, IEEE Workshop Autom. Speech Recogn. Understanding, Olomouc, Czech Republic, Dec. 8-12, 2013, pp. 55-59.
- A. Senior and I. Lopez-Moreno, Improving DNN speaker independence with i-vector inputs, IEEE Int. Conf. Acoust. Speech Signal Process., Florence, Italy, May 4-9, 2014, pp. 225-229.
- J. Trmal, J. Zelinka, and L. Muller, Adaptation of a feedforward artificial neural network using a linear transform, Int. Conf. Text Speech Dialogue Proc., Brno, Czech Republic, Sept. 6-10, 2010, pp. 423-430.
- B. Li and K.C. Sim, Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems, Annu. Conf. Int. Speech Commun. Assoc., Makuhari, Japan, Sept. 26-30, 2010, pp. 526-529.
- R. Gemello et al., Adaptation of hybrid ANN/HMM models using linear hidden transformations and conservative training, IEEE Int. Conf. Acoust. Speech Signal Process. Proc., Toulouse, France, May 14-19, 2006, pp. 1189-1192.
- K. Yao et al., Adaptation of context-dependent deep neural networks for automatic speech recognition, IEEE Spoken Language Technol. Workshop, Miami, FL, USA, Dec. 2-5, 2012, pp. 366-369.
- Y. Zhao et al., Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data, IEEE Int. Conf. Acoust. Speech Signal Process., Brisbane, Australia, Apr. 19-24, 2015, pp. 4310-4314.
- X. Li and J. Bilmes, Regularized adaptation of discriminative classifiers, IEEE Int. Conf. Acoust. Speech Signal Process., Toulouse, France, May 14-19, 2006, pp. 237-240.
- D. Yu et al., KL-divergence regularized deep neural network adaptation improved large vocabulary speech recognition, IEEE Int. Conf. Acoust. Speech Signal Process., Vancouver, Canada, May 26-31, 2013, pp. 7893-7897.
- P. Swietojanski and S. Renals, Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models, IEEE Spoken Language Technol. Workshop, South Lake Tahoe, NV, USA, Dec. 7-10, 2014, pp. 171-176.
- J. Xue et al., Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network, IEEE Int. Conf. Acoust. Speech Signal Process., Florence, Italy, May 4-9, 2014, pp. 6359-6363.
- K. Kumar et al., Intermediate-layer DNN adaptation for offline and session-based iterative speaker adaptation, Annu. Conf. Int. Speech Commun. Assoc., Dresden, Germany, Sept. 6-10, 2015, pp. 1091-1095.
- Y. Zhao, J. Li, and Y. Gong, Low-rank plus diagonal adaptation for deep neural networks, IEEE Int. Conf. Acoust. Speech Signal Process., Shanghai, China, Mar. 20-25, 2016, pp. 5005-5009.
- D. Yu and L. Deng, Automatic speech recognition: A deep learning approach, Springer-Verlag, London, UK, 2015, pp. 57-65.
- I. Sutskever et al., On the importance of initialization and momentum in deep learning, Proc. Int. Conf. Mach. Learn., Atlanta, GA, USA, June 16-21, 2013, pp. 1139-1147.
- S. Pan and Q. Yang, A survey on transfer learning, IEEE Trans. Knowl. Data Eng. 22 (2010), no. 10, 1345-1359. https://doi.org/10.1109/TKDE.2009.191
- J. Huang et al., Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers, IEEE Int. Conf. Acoust. Speech Signal Process., Vancouver, Canada, May 26-31, 2013, pp. 7304-7308.
- D. Povey et al., The Kaldi speech recognition toolkit, IEEE Workshop Autom. Speech Recogn. Understanding, Waikoloa, HI, USA, Dec. 11-15, 2011.
- Y. Miao, H. Zhang, and F. Metze, Speaker adaptive training of deep neural network acoustic models using i-vectors, IEEE/ACM Trans. Audio, Speech, Language Process. 23 (2015), no. 11, 1938-1949. https://doi.org/10.1109/TASLP.2015.2457612
- D. Snyder, D. Garcia-Romero, and D. Povey, Time delay deep neural network-based universal background models for speaker recognition, IEEE Workshop Autom. Speech Recogn. Understanding, Scottsdale, AZ, USA, Dec. 13-17, 2015, pp. 92-97.
Cited by
- Simultaneous neural machine translation with a reinforced attention mechanism, vol. 43, pp. 5, 2019, https://doi.org/10.4218/etrij.2020-0358