A Comparative Analysis of Content-based Music Retrieval Systems

  • Ro, Jung-Soon (Department of Library and Information Science, College of Liberal Arts, Hannam University)
  • Received : 2013.08.01
  • Accepted : 2013.09.10
  • Published : 2013.09.30

Abstract

This study compares and analyzes 15 content-based music retrieval (CBMR) systems accessible on the web in terms of database size and type, query type, access points, input and output formats, and search functions. To derive the features used in the comparison, prior work was reviewed on the characteristics of music information and on the techniques CBMR systems use for transforming or transcribing music sources, extracting and segmenting melodies, extracting and indexing musical features, and matching. The analysis found that text information retrieval techniques such as inverted indexing, N-gram indexing, Boolean search, truncation, keyword and phrase search, normalization, filtering, browsing, exact matching, edit-distance similarity measures, and sorting are applied to enhance CBMR; that systems make efforts to increase database size and usability; and that problems remain in extracting melodies, deleting stop notes from queries, and using solfège as pitch information.

This study surveyed content-based music retrieval (CBMR) systems accessible on the web and compared their characteristics in terms of query types, access points, input and output, search functions, and the nature and size of their databases. To derive the features used in the comparison, prior research was reviewed on the characteristics of content-based music information and on the techniques needed to build such systems: file conversion, melody extraction and segmentation, index-feature extraction and indexing, and matching. Analysis of 15 systems yielded the following findings and problems. First, text information retrieval techniques such as inverted indexing, N-gram indexing, Boolean search, truncation search, keyword and phrase search, note-duration normalization, filtering, browsing, edit distance, and sorting are also used in CBMR as tools for improving retrieval performance. Second, the systems work to grow their databases and improve practicality, for example by crawling the web or adding users' search queries to the database. Third, problems to be addressed include inaccuracy in extracting melodies or main melodies, the need to automatically remove from queries the same stop notes that are removed when extracting index features, and the shortcomings of solfège search that ignores octaves.
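The first finding, that text retrieval techniques carry over to CBMR, can be illustrated with a minimal sketch (not taken from any of the systems analyzed): melodies are reduced to pitch-interval sequences, indexed by interval N-grams in an inverted index, and candidate matches are ranked by edit distance. All melody names and note data below are hypothetical toy examples.

```python
# Sketch of two text-IR techniques applied to melodies: N-gram inverted
# indexing for candidate filtering, and edit distance for ranking.
# Melodies are hypothetical toy data given as MIDI note numbers.
from collections import defaultdict

def intervals(notes):
    """Pitch intervals make the representation transposition-invariant."""
    return tuple(b - a for a, b in zip(notes, notes[1:]))

def ngrams(seq, n=3):
    """All contiguous N-grams of an interval sequence."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def build_index(melodies, n=3):
    """Inverted index: interval N-gram -> set of melody ids."""
    index = defaultdict(set)
    for mid, notes in melodies.items():
        for g in ngrams(intervals(notes), n):
            index[g].add(mid)
    return index

def edit_distance(a, b):
    """Levenshtein distance between two interval sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def search(query_notes, melodies, index, n=3):
    """Filter candidates via the inverted index, rank by edit distance."""
    q = intervals(query_notes)
    candidates = set()
    for g in ngrams(q, n):
        candidates |= index.get(g, set())
    return sorted(candidates,
                  key=lambda m: (edit_distance(q, intervals(melodies[m])), m))

# Hypothetical toy database (MIDI pitches).
melodies = {
    "tune_a": [60, 62, 64, 65, 67, 65, 64, 62],
    "tune_b": [67, 69, 71, 72, 74, 72, 71, 69],  # tune_a transposed up
    "tune_c": [60, 60, 67, 67, 69, 69, 67],
}
index = build_index(melodies)
# A query matching tune_a's contour but sung in a different key:
print(search([62, 64, 66, 67, 69, 67, 66, 64], melodies, index))
# → ['tune_a', 'tune_b']
```

Because the index stores intervals rather than absolute pitches, both transposition-equivalent tunes match the query exactly, while the unrelated tune is never even considered; real systems layer duration normalization, stop-note removal, and approximate N-gram matching on top of this basic pipeline.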


References

  1. 구경이, 임상혁, 이재헌, 김유성 (2003). 주제 선율 색인을 이용한 내용 기반 음악정보 검색 시스템. 데이터베이스 연구, 19(3), 34-45. (Ku, Kyong-I, Lim, Sang-Hyuk, Lee, Jae-Heon, & Kim, Yoo-Sung (2003). A content-based music information retrieval system using theme melody index. Database Research, 19(3), 34-45.)
  2. 김무정, 낭종호 (2011). Query By Humming 응용을 위한 Midi 파일에서의 자동 멜로디 트랙 선택 방법. 한국정보과학회 한국컴퓨터종합학술대회 논문집, 38(1B), 405-408. (Kim, Moojung, & Nang, Jongho (2011). An automatic melody track selection in MIDI files for query by humming (QBH) application. Proceedings of Conference of the Korea Information Science Society, 38(1B), 405-408.)
  3. 노정순 (2011). 정보검색: 이론과 실제. 대전: 글누리. (Ro, Jung-Soon (2011). Information retrieval: Theory and practice. Daejeon: Geulnuri.)
  4. 박만수, 김회린 (2006). 실제 잡음 환경에 강인한 오디오 핑거프린팅 기법. Telecommunications Review (SK Telecom), 16(3), 435-446. (Park, Mansoo, & Kim, Hoirin (2006). An audio fingerprinting scheme robust to real-noise environments. Telecommunications Review (SK Telecom), 16(3), 435-446.)
  5. 유진희, 박상현 (2007). 허밍 질의 처리 시스템의 성능 향상을 위한 효율적인 빈번 멜로디 인덱싱 방법. 정보과학회논문지: 데이터베이스, 34(4), 283-303. (You, Jinhee, & Park, Sanghyun (2007). An efficient frequent melody indexing method to improve performance of query-by-humming system. Journal of KISS: Database, 34(4), 283-303.)
  6. 최윤재, 박종철 (2009). 음악의 특성에 따른 피아노 솔로 음악으로부터의 멜로디 추출. 정보과학회논문지: 컴퓨팅의 실제 및 레터, 15(12), 923-927. (Choi, Yoonjae, & Park, Jong C. (2009). Extracting melodies from piano solo music based on its characteristics. Journal of KISS(C): Computing Practices and Letters, 15(12), 923-927.)
  7. Arifi, V., Clausen, M., Kurth, F., & Muller, M. (2003). Automatic synchronization of music data in score-, MIDI-, and PCM-format. Proceedings of ISMIR 2003. Retrieved from http://ismir2003.ismir.net/papers/Arifi.pdf
  8. Bainbridge, D. (2004). Music information retrieval research and its context at the University of Waikato. Journal of the American Society for Information Science and Technology, 55(12), 1092-1099. http://dx.doi.org/10.1002/asi.20062
  9. Bandera, C. de la, Barbancho, A. M., Tardon, L. J., Sammartino, S., & Barbancho, I. (2011). Humming method for content-based music information retrieval. Proceedings of ISMIR 2011, 49-54.
  10. Cano, P., Batlle, E., Kalker, T., & Haitsma, J. (2005). A review of audio fingerprinting. Journal of VLSI Signal Processing, 41, 271-284. http://dx.doi.org/10.1007/s11265-005-4151-3
  11. Cartwright, M. B., Rafii, Z., Han, J. Y., & Pardo, B. (2011). Making searchable melodies: Human versus machine. Proceedings of Human Computation, 2011. Retrieved from http://www.cs.northwestern.edu/~jha222/paper/2011_humancomp_cartwright_etal.pdf
  12. Chai, W., & Vercoe, B. (2002). Melody retrieval on the web. Proceedings of ACM/SPIE Conference on Multimedia Computing and Networking, 226. http://dx.doi.org/10.1117/12.449982
  13. Chandrasekhar, V., Sharifi, M., & Ross, D. A. (2011). Survey and evaluation of audio fingerprinting schemes for mobile query-by-example applications. Proceedings of ISMIR 2011, 801-806.
  14. Chen, R., Shen, W., Srinivasamurthy, A., & Chordia, P. (2012). Chord recognition using duration-explicit hidden Markov models. Proceedings of ISMIR 2012, 445-450.
  15. Cheng, H. T., Yang, Y. H., Lin, Y. C., Liao, I. B., & Chen, H. H. (2008). Automatic chord recognition for music classification and retrieval. Proceedings of the IEEE International Conference on Multimedia and Expo 2008, 1505-1508. http://dx.doi.org/10.1109/ICME.2008.4607732
  16. Chew, E., Georgiou, P., & Narayanan, S. (2008). Challenging uncertainty in query by humming systems: A fingerprinting approach. IEEE Transactions on Audio, Speech, and Language Processing, 16(2), 359-371. http://dx.doi.org/10.1109/TASL.2007.912373
  17. Cilibrasi, R., Vitanyi, P., & Wolf, R. (2004). Algorithmic clustering of music based on string compression. Computer Music Journal, 28(4), 49-67.
  18. Dannenberg, R., Birmingham, W. P., Hu, N., Meek, C., Pardo, B., & Tzanetakis, G. (2007). A comparative evaluation of search techniques for query by humming using the MUSART testbed. Journal of the American Society for Information Science and Technology, 58(5), 687-701.
  19. Gerhard, D. (2003). Pitch extraction and fundamental frequency: History and current techniques. Technical report TR-CS 2003-06. Retrieved from http://audio-fingerprint.googlecode.com/svn-history/r62/trunk/referencias/2003-06.pdf
  20. Duggan, B., O'Shea, B., Gainza, M., & Cunningham, P. (2009). Compensating for expressiveness in queries to a content based music information retrieval system. Proceedings of the International Computer Music Conference (ICMC 2009), 33-36.
  21. Doraisamy, S., & Ruger, S. (2002). Robust polyphonic music retrieval with n-grams. Journal of Intelligent Information Systems, 21(1), 53-70. http://dx.doi.org/10.1023/A:1023553801115
  22. Downie, S. (1999). Evaluating a simple approach to music information retrieval: Conceiving melodic N-grams as text. Unpublished doctoral dissertation. University of Western Ontario, Canada.
  23. Ghias, A., Logan, J., Chamberlin, D., & Smith, B. (1995). Query by humming: Musical information retrieval in an audio database. Proceedings of the 3rd Annual ACM International Conference on Multimedia, 231-236.
  24. Goto, M. (2004). A real-time music-scene-description system: Predominant-F0 estimation for detecting melody and bass lines in real-world audio signals. Speech Communication, 43(4), 311-329. http://dx.doi.org/10.1016/j.specom.2004.07.001
  25. Hanna, P., Ferraro, P., & Robine, M. (2007). On optimizing the editing algorithms for evaluating similarity between monophonic musical sequences. Journal of New Music Research, 36(4), 267-279. http://dx.doi.org/10.1080/09298210801927861
  26. Huq, A., Cartwright, M., & Pardo, B. (2010). Crowdsourcing a real-world on-line query by humming system. Proceedings of the 7th Sound and Music Computing Conference, 2010, Barcelona, Spain. Retrieved from http://music.eecs.northwestern.edu/publications/smc2010-huq-cartwright-pardo.pdf
  27. Kan, M., Wang, Y., Iskandar, D., Nwe, T. L., & Shenoy, A. (2008). LyricAlly: Automatic synchronization of textual lyrics to acoustic music signals. IEEE Transaction on Audio, Speech, and Language Processing, 16(2), 338-349. http://dx.doi.org/10.1109/TASL.2007.911559
  28. Kline, R. L., & Glinert, E. P. (2003). Approximate matching algorithms for music information retrieval using vocal input. Proceedings of the Eleventh ACM International Conference on Multimedia 2003, 130-139. http://dx.doi.org/10.1145/957013.957042
  29. Kornstadt, A. (1998). Themefinder: A web-based melodic search tool. Computing in Musicology, 11, 231-236.
  30. Lee, K., & Slaney, M. (2008). Acoustic chord transcription and key extraction from audio using key-dependent HMMs trained on synthesized audio. IEEE Transactions on Audio, Speech, and Language Processing, 16(2), 291-301. http://dx.doi.org/10.1109/TASL.2007.914399
  31. Lee, Y. J., & Moon, S. B. (2006). A user study on information searching behaviors for designing user-centered query interface of content-based music information retrieval system. Journal of the Korean Society for Information Management, 23(2), 5-19. http://dx.doi.org/10.3743/KOSIM.2006.23.2.005
  32. Lemstrom, K., & Pienimaki, A. (2007). On comparing edit distance and geometric frameworks in content-based retrieval of symbolically encoded polyphonic music. Musicae Scientiae, Discussion Forum 4a, 135-152.
  33. Lemstrom, K., & Tarhio, J. (2003). Transposition invariant pattern matching for multi-track strings. Nordic Journal of Computing, 10, 185-205.
  34. McNab, R. J., Smith, L. A., Witten, I. H., & Cunningham, S. J. (1996). Towards the digital music library: Tune retrieval from acoustic input. Proceedings of the 1st ACM International Conference on Digital Libraries, 11-18.
  35. McNab, R. J., Smith, L. A., Bainbridge, D., & Witten, I. H. (1997). The New Zealand Digital Library MELody inDEX. D-Lib Magazine, 3(5), 4-15.
  36. Melucci, M., & Orio, N. (2004). Combining melody processing and information retrieval techniques: Methodology, evaluation and system implementation. Journal of the American Society for Information Science and Technology, 55(12), 1058-1066. http://dx.doi.org/10.1002/asi.20058
  37. Mesaros, A., & Virtanen, T. (2010). Automatic recognition of lyrics in singing. EURASIP Journal on Audio, Speech, and Music Processing, 2010:546047. http://dx.doi.org/10.1155/2010/546047
  38. Nam, G. P., Park, K. R., Park, S., Lee, S., & Kim, M. (2012) A new query-by-humming system based on the score level fusion of two classifiers. International Journal of Communication Systems, 25(6), 717-733. http://dx.doi.org/10.1002/dac.1187
  39. Papadopoulos, H., & Tzanetakis, G. (2012). Modeling chord and key structure with Markov logic. Proceedings of ISMIR 2012, 127-132.
  40. Pardo, B., Shiffrin, J., & Birmingham, W. (2004). Name that tune: A pilot study in finding a melody from a sung query. Journal of the American Society for Information Science and Technology, 55(4), 283-300. http://dx.doi.org/10.1002/asi.10373
  41. Prechelt, L., & Typke, R. (2001). An interface for melody input. ACM Transactions on Computer-Human Interaction, 8(2), 133-149. http://dx.doi.org/10.1145/376929.376978
  42. Rho, S., Han, B., Hwang, E., & Kim, M. (2008). MUSEMBLE: A novel music retrieval system with automatic voice query transcription and reformulation. The Journal of Systems and Software, 81(7), 1065-1080. http://dx.doi.org/10.1016/j.jss.2007.05.038
  43. Dannenberg, R. B., & Hu, N. (2004). Understanding search performance in query-by-humming systems. Proceedings of ISMIR 2004, 85-89.
  44. Sheh, A., & Ellis, D. P. (2003). Chord segmentation and recognition using EM-trained hidden Markov models. Proceedings of ISMIR 2003. Retrieved from http://ismir2003.ismir.net/papers/Sheh.PDF
  45. Shiffrin, J., Pardo, B., & Birmingham, W. (2002). HMM-based musical query retrieval. Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries, 295-300.
  46. Tripathy, A., Chhaatre, N., Surendranath, N., & Kalsi, M. (2009). Query by humming system. International Journal of Recent Trends in Engineering, 2(5), 373-379.
  47. Turetsky, R. J., & Ellis, D. P. W. (2003). Ground truth transcriptions of real music from force-aligned MIDI syntheses. Proceedings of ISMIR 2003, 445-448.
  48. Typke, R. (2007). Music retrieval based on melodic similarity. Unpublished doctoral dissertation. Universiteit Utrecht, The Netherlands.
  49. Typke, R., Veltkamp, R. C., & Wiering, F. (2004). Searching notated polyphonic music using transportation distances. Proceedings of the 12th Annual ACM International Conference on Multimedia, 128-135.
  50. Typke, R., Wiering, F., & Veltkamp, R. C. (2005). A survey of music information retrieval systems. Proceedings of ISMIR 2005, 153-160.
  51. Viro, V. (2011). Peachnote: Music score search and analysis platform. Proceedings of ISMIR 2011, 359-362.
  52. Wan, C., & Liu, M. (2006). Content-based audio retrieval with relevance feedback. Pattern Recognition Letters, 27(2), 85-92. http://dx.doi.org/10.1016/j.patrec.2005.07.005
  53. Wang, A. (2003). An industrial-strength audio search algorithm. Proceedings of the 4th International Conference on Music Information Retrieval. Retrieved from http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
  54. Wang, A. (2006). The Shazam music recognition service. Communications of the ACM, 49(8), 44-48. http://dx.doi.org/10.1145/1145287.1145312
  55. Wang, C., Li, J., & Shi, S. (2006). N-gram inverted index structures on music data for theme mining and content-based information retrieval. Pattern Recognition Letters, 27(5), 492-503. http://dx.doi.org/10.1016/j.patrec.2005.09.012
  56. Wold, E., Blum, T., Keislar, D., & Wheaton, J. (1996). Content-based classification, search, and retrieval of audio. IEEE Multimedia, 3(3), 27-36. https://doi.org/10.1109/93.556537
  57. Zhu, Y., & Shasha, D. (2003). Warping indexes with envelops transforms for query by humming. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD '03), 181-192. http://dx.doi.org/10.1145/872757.872780

Cited by

  1. Highlight based Lyrics Search Considering the Characteristics of Query vol.26, pp.4, 2016, https://doi.org/10.5391/JKIIS.2016.26.4.301