References
- M.H. Savoji, "A Robust Algorithm for Accurate End Pointing of Speech," Speech Commun., 1989, vol. 8, no. 1, pp. 45-60. https://doi.org/10.1016/0167-6393(89)90067-8
- T. Kristjansson, S. Deligne, and P. Olsen, "Voicing Features for Robust Speech Detection," Interspeech, 2005, pp. 369-372.
- R.E. Yantorno, K.L. Krishnamachari, and J.M. Lovekin, "The Spectral Autocorrelation Peak Valley Ratio (SAPVR): A Usable Speech Measure Employed as a Co-channel Detection System," IEEE Int. Workshop Intell. Signal Process., 2001, pp. 193-197.
- J.L. Shen, J.W. Hung, and L.S. Lee, "Robust Entropy Based Endpoint Detection for Speech Recognition in Noisy Environments," ICSP, 1998, pp. 232-235.
- A. Benyassine et al., "ITU-T Recommendation G.729 Annex B: A Silence Compression Scheme for Use with G.729 Optimized for V.70 Digital Simultaneous Voice and Data Applications," IEEE Commun. Mag., vol. 35, 1997, pp. 64-73.
- M. Marzinzik and B. Kollmeier, "Speech Pause Detection for Noise Spectrum Estimation by Tracking Power Envelope Dynamics," IEEE Trans. Speech Audio Process., vol. 10, 2002, pp. 109-118. https://doi.org/10.1109/89.985548
- J. Ramírez et al., "Efficient Voice Activity Detection Algorithms Using Long-Term Speech Information," Speech Commun., 2004, vol. 42, pp. 271-287. https://doi.org/10.1016/j.specom.2003.10.002
- B.F. Wu and K.C. Wang, "Robust Endpoint Detection Algorithm Based on the Adaptive Band Partitioning Spectral Entropy in Adverse Environments," IEEE Trans. Speech Audio Process., vol. 13, 2005, pp. 762-775. https://doi.org/10.1109/TSA.2005.851909
- S. Ahmadi and A.S. Spanias, "Cepstrum-Based Pitch Detection Using a New Statistical V/UV Classification Algorithm," IEEE Trans. Speech Audio Process., vol. 7, 1999, pp. 333-338. https://doi.org/10.1109/89.759042
- Y. Tian, Z. Wang, and D. Lu, "Non-Speech Segment Rejection Based on Prosodic Information for Robust Speech Recognition," IEEE Signal Process. Lett., vol. 9, no. 11, 2002, pp. 364-367. https://doi.org/10.1109/LSP.2002.804564
- K. Ishizuka et al., "Noise Robust Voice Activity Detection Based on Periodic to Aperiodic Component Ratio," Speech Commun., vol. 52, 2010, pp. 41-60. https://doi.org/10.1016/j.specom.2009.08.003
- S. Shafiee et al., "A Two-Stage Speech Activity Detection System Considering Fractal Aspects of Prosody," Pattern Recog. Lett., 2010.
- M. Fujimoto and K. Ishizuka, "Noise Robust Voice Activity Detection Based on Switching Kalman Filter," IEICE Trans. Inf. Syst., 2008, E91-D, pp. 467-477. https://doi.org/10.1093/ietisy/e91-d.3.467
- A. Agarwal and Y.M. Cheng, "Two-Stage Mel-Warped Wiener Filter for Robust Speech Recognition," IEEE Workshop Auto. Speech Recog. Understanding, 1999, pp. 67-70.
- D. Cournapeau and T. Kawahara, "Evaluation of Real-Time Voice Activity Detection Based on High Order Statistics," Interspeech, 2007, pp. 2945-2949.
- H. Kato Solvang, K. Ishizuka, and M. Fujimoto, "Voice Activity Detection Based on Adjustable Linear Prediction and GARCH Models," Speech Commun., 2008, vol. 50, pp. 476-486. https://doi.org/10.1016/j.specom.2008.02.003
- M.H. Moattar and M.M. Homayounpour, "A Simple but Efficient Real-Time Voice Activity Detection Algorithm," Eusipco, 2009, pp. 2549-2553.
- I.C. Yoo and D. Yook, "Robust Voice Activity Detection Using the Spectral Peaks of Vowel Sounds," ETRI J., vol. 31, no. 4, 2009, pp. 451-453. https://doi.org/10.4218/etrij.09.0209.0104
- M.H. Moattar, M.M. Homayounpour, and N.K. Kalantari, "A New Approach for Robust Realtime Voice Activity Detection Using Spectral Pattern," ICASSP, 2010, pp. 4478-4481.
- J.S. Garofolo et al., DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM, Linguistic Data Consortium, 1993.
- M. Bijankhan and M.J. Sheikhzadegan, "FARSDAT- the Farsi Spoken Language Database," 5th Australian Int. Conf. Speech Sci. Technol., 1994, vol. 2, pp. 826-829.
- H.G. Hirsch and D. Pearce, "The AURORA Experimental Framework for the Performance Evaluation of Speech Recognition Systems under Noise Conditions," ISCA ITRW, 2000, pp. 181-188.
- A.P. Varga et al., "The NOISEX-92 Study on the Effect of Additive Noise on Automatic Speech Recognition," Technical report, DRA Speech Research Unit, 1992.
- B. Lee and M. Hasegawa-Johnson, "Minimum Mean Squared Error A Posteriori Estimation of High Variance Vehicular Noise," Biennial DSP In-Vehicle Mobile Syst., 2007.
- ETSI, Digital Cellular Telecommunications Systems (Phase 2+); Voice Activity Detector (VAD) for Adaptive Multi-Rate (AMR) Speech Traffic Channels, GSM 06.94, version 7.1.1, EN 301 708, 1999.
- ETSI, Speech Processing, Transmission, and Quality Aspects (STQ), Distributed Speech Recognition, Advanced Front-End Feature Extraction Algorithm, Compression Algorithms, version 1.1.1, ES 202 050, 2001.
Cited by
- A Hierarchical Framework Approach for Voice Activity Detection and Speech Enhancement vol.2014, 2011, https://doi.org/10.1155/2014/723643
- Manifold learning based speaker dependent dimension reduction for robust text independent speaker verification vol.17, pp.3, 2014, https://doi.org/10.1007/s10772-014-9228-6
- Formant-Based Robust Voice Activity Detection vol.23, pp.12, 2011, https://doi.org/10.1109/taslp.2015.2476762
- Efficient harmonic peak detection of vowel sounds for enhanced voice activity detection vol.12, pp.8, 2011, https://doi.org/10.1049/iet-spr.2017.0553