Speech Query Recognition for Tamil Language Using Wavelet and Wavelet Packets

  • Iswarya, P. (Dept. of Computer Science, Avinashilingam Institute for Home Science and Higher Education for Women)
  • Radha, V. (Dept. of Computer Science, Avinashilingam Institute for Home Science and Higher Education for Women)
  • Received : 2014.07.29
  • Accepted : 2015.08.05
  • Published : 2017.10.31

Abstract

Speech recognition is one of the fascinating fields in computer science. The accuracy of a speech recognition system may drop when noise is present in the speech signal. Noise removal is therefore an essential step in an Automatic Speech Recognition (ASR) system, and this paper proposes a new noise removal technique called combined thresholding. Feature extraction is the process of converting the acoustic signal into a set of the most valuable parameters. This paper also aims to improve Mel Frequency Cepstral Coefficient (MFCC) features by using the Discrete Wavelet Packet Transform (DWPT) in place of the Discrete Fourier Transform (DFT) block, to provide more efficient signal analysis. Because the resulting feature vectors vary in size, a Self-Organizing Map (SOM) is used to obtain feature vectors of the correct, fixed length. Since a single classifier does not provide enough accuracy, this research proposes an Ensemble Support Vector Machine (ESVM) classifier that takes the fixed-length feature vectors from the SOM as input; the combination is termed ESVM_SOM. The experimental results show that the proposed methods perform better than the existing methods.
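
The abstract does not spell out how combined thresholding works, so the following is only a minimal sketch of the general idea behind wavelet-based noise removal, not the authors' method. It assumes a one-level Haar transform, plain soft thresholding, and the widely used universal threshold with a median-based noise estimate. All function names are hypothetical and chosen for illustration.

```python
import math
import random
import statistics


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT (signal length must be even)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; zero out those below t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def universal_threshold(detail, n):
    """sigma * sqrt(2 ln n), with sigma estimated from the median |detail|."""
    sigma = statistics.median(abs(d) for d in detail) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))


def denoise(signal):
    """Threshold only the detail band; the approximation band is kept as-is."""
    approx, detail = haar_dwt(signal)
    t = universal_threshold(detail, len(signal))
    return haar_idwt(approx, soft_threshold(detail, t))


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    # Synthetic stand-in for a speech frame: one sine period plus Gaussian noise.
    rng = random.Random(0)
    clean = [math.sin(2 * math.pi * i / 256) for i in range(256)]
    noisy = [c + rng.gauss(0.0, 0.2) for c in clean]
    print("MSE before:", mse(noisy, clean))
    print("MSE after: ", mse(denoise(noisy), clean))
```

A practical system would likely use several decomposition levels and a smoother wavelet than Haar; the single level here only keeps the sketch short. The test signal is synthetic and not taken from the paper's data.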
