Overlapping sound event

Overlapping Sound Event Detection Using NMF with K-SVD Based Dictionary Learning (K-SVD 기반 사전 훈련과 비음수 행렬 분해 기법을 이용한 중첩음향이벤트 검출)

  • Choi, Hyeonsik; Keum, Minseok; Ko, Hanseok
    • The Journal of the Acoustical Society of Korea, v.34 no.3, pp.234-239, 2015
  • Non-Negative Matrix Factorization (NMF) is a method that updates a dictionary and its gains in an alternating manner. Because it is easy to implement and intuitive to interpret, NMF is widely used to detect and separate overlapping sound events. However, the non-negativity constraints in NMF produce a parts-based representation, and this property leads to a dictionary containing fragmented acoustic events. As a result, the presence of shared basis vectors degrades performance in both the separation and the detection of overlapping sound events. In this paper, we propose a new method that uses a K-Singular Value Decomposition (K-SVD) based dictionary to mitigate the parts-based representation issue during the dictionary learning step. Subsequently, we calculate the gains with NMF in the sound event detection step. Our evaluation confirms that the proposed method detects overlapping sound events better than the conventional method that uses an NMF-based dictionary.
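To illustrate the detection step described in the abstract, below is a minimal Python sketch of estimating gains with NMF while the dictionary is held fixed. It assumes the dictionary W has already been learned offline (for example with K-SVD, as the paper proposes); the multiplicative update shown is the standard Frobenius-norm NMF gain update, not the authors' exact implementation, and names such as nmf_gain, class_activations, and atom_labels are hypothetical.

```python
import numpy as np

def nmf_gain(V, W, n_iter=200, eps=1e-9):
    """Estimate non-negative gains H for a magnitude spectrogram V (freq x time)
    with the dictionary W (freq x atoms) held fixed, using the standard
    multiplicative update for the Frobenius-norm NMF objective ||V - W H||_F^2."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def class_activations(H, atom_labels, n_classes):
    """Sum the gains of all atoms belonging to each event class,
    giving one activation curve per class (classes x time)."""
    A = np.zeros((n_classes, H.shape[1]))
    for atom, label in enumerate(atom_labels):
        A[label] += H[atom]
    return A

# Hypothetical usage: W stacks per-event sub-dictionaries learned beforehand
# (e.g., with K-SVD); thresholding the class activations yields the detected,
# possibly overlapping, sound events per frame.
# H = nmf_gain(V, W)
# A = class_activations(H, atom_labels, n_classes)
# detections = A > threshold
```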

Acoustic Event Detection in Multichannel Audio Using Gated Recurrent Neural Networks with High-Resolution Spectral Features

  • Kim, Hyoung-Gook; Kim, Jin Young
    • ETRI Journal, v.39 no.6, pp.832-840, 2017
  • Recently, deep recurrent neural networks have achieved great success in various machine learning tasks, and have also been applied to sound event detection. Detecting temporally overlapping sound events in realistic environments is much more challenging than monophonic detection. In this paper, we present an approach to improve the accuracy of polyphonic sound event detection in multichannel audio based on gated recurrent neural networks in combination with auditory spectral features. In the proposed method, human hearing perception-based spatial and spectral-domain noise-reduced harmonic features are extracted from multichannel audio and used as high-resolution spectral inputs to train gated recurrent neural networks. This provides faster and more stable convergence than long short-term memory recurrent neural networks. Our evaluation reveals that the proposed method outperforms the conventional approaches.
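As an illustration of the recurrent architecture described above, here is a minimal Python/Keras sketch of a gated recurrent network for polyphonic sound event detection. It is not the paper's exact model or feature front end (the noise-reduced harmonic features are assumed to be extracted separately); per-frame sigmoid outputs with a binary cross-entropy loss allow several events to be active in the same frame, which is what makes the task polyphonic. The function name build_gru_sed_model and the layer sizes are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_gru_sed_model(n_features, n_classes, n_units=128):
    """Minimal GRU model for polyphonic (multi-label) sound event detection.
    Input: a sequence of spectral feature vectors (time x n_features).
    Output: per-frame sigmoid scores for each event class, so several
    events may be detected simultaneously."""
    model = models.Sequential([
        layers.Input(shape=(None, n_features)),                  # variable-length sequences
        layers.Bidirectional(layers.GRU(n_units, return_sequences=True)),
        layers.TimeDistributed(layers.Dense(n_classes, activation="sigmoid")),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

# Hypothetical usage: X has shape (batch, time, n_features) and Y has shape
# (batch, time, n_classes) with multi-hot frame labels.
# model = build_gru_sed_model(n_features=64, n_classes=10)
# model.fit(X, Y, epochs=50, batch_size=16)
```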