Separation of Single Channel Mixture Using Time-domain Basis Functions

  • Jang, Gil-Jin (Department of Computer Science, Korea Advanced Institute of Science and Technology)
  • Oh, Yung-Hwan (Department of Computer Science, Korea Advanced Institute of Science and Technology)
  • Published : 2002.12.01

Abstract

We present a new technique for achieving source separation when given only a single-channel recording. The main idea is to exploit the inherent time structure of sound sources by learning, a priori, sets of time-domain basis functions that encode the sources in a statistically efficient manner. We derive a learning algorithm using a maximum likelihood approach, given the observed single-channel data and the sets of basis functions. For each time point we infer the source parameters and their contribution factors. This inference is possible because of the prior knowledge of the basis functions and the associated coefficient densities. A flexible model for density estimation allows accurate modeling of the observation, and our experimental results exhibit a high level of separation performance for simulated mixtures as well as real-environment recordings containing mixtures of two different sources. We show separation results for two music signals as well as for two voice signals.
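
The role of the learned basis functions and coefficient densities can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes that a short segment x of a source is encoded as s = W x by an invertible set of basis filters W, that the coefficients follow independent generalized Gaussian densities with scale `sigma` and exponent `q`, and that the observation is a weighted sum of two sources. Under these assumptions the log-likelihood of a segment is the sum of the coefficient log-densities plus log|det W|, which is the quantity a maximum likelihood method would compare across the two basis sets.

```python
import math

def gen_gauss_logpdf(s, sigma=1.0, q=1.0):
    """Log of the density q / (2*sigma*Gamma(1/q)) * exp(-|s/sigma|**q)."""
    log_norm = math.log(q) - math.log(2.0 * sigma) - math.lgamma(1.0 / q)
    return log_norm - abs(s / sigma) ** q

def coefficients(W, x):
    """Encode segment x with the rows of W (assumed inverse basis filters)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def segment_loglik(x, W, log_abs_det_W, sigma=1.0, q=1.0):
    """log p(x) = sum_k log p(s_k) + log|det W|, with s = W x."""
    s = coefficients(W, x)
    return sum(gen_gauss_logpdf(s_k, sigma, q) for s_k in s) + log_abs_det_W

def mix(x1, x2, lam1, lam2):
    """Single-channel observation as a weighted sum of two source segments."""
    return [lam1 * a + lam2 * b for a, b in zip(x1, x2)]

# Two toy orthonormal basis sets (illustrative only); both have log|det W| = 0.
r = 1.0 / math.sqrt(2.0)
W_smooth = [[r, r], [r, -r]]          # sum/difference filters
W_spiky = [[1.0, 0.0], [0.0, 1.0]]    # identity (sample-wise) filters

smooth_seg = [1.0, 1.0]  # sparse under W_smooth
spiky_seg = [1.0, 0.0]   # sparse under W_spiky

# With a sparse prior (q < 1), each segment is more likely under the basis
# set that encodes it with fewer active coefficients.
print(segment_loglik(smooth_seg, W_smooth, 0.0, q=0.5) >
      segment_loglik(smooth_seg, W_spiky, 0.0, q=0.5))
print(segment_loglik(spiky_seg, W_spiky, 0.0, q=0.5) >
      segment_loglik(spiky_seg, W_smooth, 0.0, q=0.5))

# A hypothetical single-channel observation with contribution factors 0.6 and 0.4.
y = mix(smooth_seg, spiky_seg, 0.6, 0.4)
```

In this toy setting the difference in likelihood between the two basis sets is what makes the sources distinguishable from one channel. The sketch only scores given segments; it does not perform the inference of source parameters and contribution factors described in the abstract.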
