Voice Activity Detection with Run-Ratio Parameter Derived from Runs Test Statistic

  • Oh, Kwang-Cheol (Human and Computer Interaction Lab., Samsung Advanced Institute of Technology)
  • Published : 2003.03.01

Abstract

This paper describes a new parameter for voice activity detection, which serves as a front end for automatic speech recognition systems. The new parameter, called the run-ratio, is derived from the runs test statistic, which is used to test statistically whether a given sequence is random. The run-ratio has the property that its value for a random sequence is about 1. To apply the run-ratio to voice activity detection, the samples of the input audio signal are converted to a binary sequence according to whether each sample is positive or negative. Silence regions of the audio signal can then be regarded as random sequences, so their run-ratio values are about 1. The run-ratio is far lower than 1 for voiced regions and higher than 1 for fricative sounds. The run-ratio can therefore discriminate speech from background sounds. The proposed voice activity detector outperformed a conventional energy-based detector in terms of the mean and variance of the error, with small deviation from the true speech boundaries and a low chance of missing real utterances.
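
The abstract does not give the formula for the run-ratio, so the following is only a minimal sketch under a stated assumption: the run-ratio is taken here to be the observed number of sign runs in a frame divided by the number of runs expected for a random sequence under the standard Wald-Wolfowitz runs test, 2·n1·n2/(n1+n2) + 1, where n1 and n2 are the counts of non-negative and negative samples. The function name `run_ratio` and the choice to count zero-valued samples as positive are illustrative and may differ from the paper's definition.

```python
def run_ratio(samples):
    """Observed sign runs divided by the runs expected for a random sequence.

    Assumption: this ratio form is inferred from the abstract, not taken
    from the paper's own definition.
    """
    if len(samples) == 0:
        raise ValueError("need at least one sample")

    # Binarize the frame: True for non-negative samples, False for negative.
    signs = [s >= 0 for s in samples]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos

    # A run is a maximal stretch of equal signs, so the number of runs is
    # one more than the number of sign changes between neighbours.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    # Expected number of runs under the Wald-Wolfowitz runs test.
    # For a frame with a single sign this is 1, so the ratio is 1.
    expected = 2.0 * n_pos * n_neg / (n_pos + n_neg) + 1.0
    return runs / expected
```

Under this assumption, a frame of zero-mean noise yields a value near 1, a slowly varying waveform such as a low-pitched voiced sound yields a value well below 1 because its sign changes rarely, and a rapidly alternating waveform yields a value above 1. A detector could then flag frames whose run-ratio departs from 1 by more than some threshold; the threshold itself would need tuning and is not specified in the abstract.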

Keywords