Harmonic and Percussive Separation Based on NMF and Tonality Mask

  • Choi, Keunwoo (Broadcasting & Telecommunications Convergence Research Laboratory, ETRI);
  • Chon, Sang Bae (DMC Research Center, Samsung Electronics);
  • Kang, Kyeongok (Broadcasting & Telecommunications Convergence Research Laboratory, ETRI)
  • Received: 2012.03.07
  • Accepted: 2012.08.27
  • Published: 2012.12.31


In this letter, we present a new algorithm for separating jazz music into its harmonic and percussive parts. Using a short-time Fourier transform and nonnegative matrix factorization, the signal is decomposed into rank-one components. Each component is then split into a harmonic part and a percussive part using masks calculated from its tonality. Finally, the harmonic and percussive signals are obtained by applying the masks and summing the masked components. We evaluate the algorithm on real audio examples using both objective and subjective assessments. The proposed algorithm performs well in separating the harmonic and percussive parts of jazz excerpts.
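The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the tonality measure, so `tonality_mask` below is a hypothetical stand-in that treats sharp spectral peaks in a basis vector as tonal and flat regions as noise-like. The function names, the rank of 8, and the STFT settings are likewise illustrative choices. The factorization uses the multiplicative updates of Lee and Seung for the Euclidean cost.

```python
import numpy as np

def stft_mag(x, n_fft=256, hop=128):
    """Magnitude spectrogram (frequency x time) with a Hann window."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop:i * hop + n_fft] * win for i in range(n_frames)], axis=1
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF (Euclidean cost), so that V ~ W @ H."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def tonality_mask(w, width=9, eps=1e-9):
    """ASSUMPTION: hypothetical tonality proxy, not the paper's measure.

    Compares each bin of a basis spectrum with its local average along
    frequency. Sharp peaks give values near 1 (tonal); flat regions
    give values near 0.5.
    """
    kernel = np.ones(width) / width
    local = np.convolve(w, kernel, mode="same")
    return w**2 / (w**2 + local**2 + eps)

def separate(V, rank=8):
    """Split each NMF component with its mask, then sum over components."""
    W, H = nmf(V, rank)
    harmonic = np.zeros_like(V)
    percussive = np.zeros_like(V)
    for k in range(rank):
        component = np.outer(W[:, k], H[k])    # rank-one spectrogram
        mask = tonality_mask(W[:, k])[:, None]  # one value per frequency bin
        harmonic += mask * component
        percussive += (1.0 - mask) * component
    return harmonic, percussive

# Toy input: a steady sine (harmonic) plus periodic clicks (percussive).
t = np.arange(4096)
signal = np.sin(2 * np.pi * 0.05 * t)
signal[::512] += 5.0
V = stft_mag(signal)
harmonic_mag, percussive_mag = separate(V)
```

The sketch stops at magnitude spectrograms. Recovering time-domain signals would additionally require applying the masks to the complex STFT and inverting it, which is omitted here.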


Supported by: Ministry of Knowledge Economy

