Search | Korea Science

Automatic melody extraction algorithm using a convolutional neural network

Lee, Jongseol;Jang, Dalwon;Yoon, Kyoungro
- KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.12 / pp.6038-6053 / 2017
In this study, we propose an automatic melody extraction algorithm using deep learning. Feature images, generated from the energy of frequency bands, are extracted from polyphonic audio files, and a convolutional neural network (CNN) is applied to them. In the training data, each short frame of polyphonic music is labeled with a musical note, and a CNN-based classifier is trained to determine the pitch value of a short frame of the audio signal. Our aim is a novel structure for melody extraction: the proposed algorithm is simple and, instead of combining various signal processing techniques, uses only a CNN to find the melody in polyphonic audio. Despite this simple structure, the experiments give promising results. The proposed algorithm did not give the best result among state-of-the-art algorithms, but its results were comparable, and we believe they could be improved with appropriate training data. In this paper, melody extraction and the proposed algorithm are introduced first, and the proposed algorithm is then explained in detail. Finally, we present our experiments and a comparison of results.
https://doi.org/10.3837/tiis.2017.12.019
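The abstract above says each short training frame is labeled with a musical note. As a rough illustration of that labeling step only, the sketch below maps a per-frame melody pitch annotation in Hz to a note-class index using the standard MIDI pitch formula. The function name, the `UNVOICED` label, and the note range (MIDI 36 to 95) are assumptions made for this example and are not taken from the paper.

```python
import math

UNVOICED = 0  # hypothetical class for frames with no melody


def hz_to_note_label(f0_hz, lowest_midi=36, highest_midi=95):
    """Map an annotated melody pitch (Hz) to a note-class label.

    Labels 1..N cover MIDI notes lowest_midi..highest_midi (assumed range);
    label 0 means no melody or a pitch outside that range.
    """
    if f0_hz <= 0:
        return UNVOICED
    # Standard conversion: A4 = 440 Hz = MIDI note 69, 12 semitones per octave.
    midi = round(69 + 12 * math.log2(f0_hz / 440.0))
    if midi < lowest_midi or midi > highest_midi:
        return UNVOICED
    return midi - lowest_midi + 1


# One pitch annotation per frame: silence, C4, A4, A#4.
frame_f0 = [0.0, 261.63, 440.0, 466.16]
labels = [hz_to_note_label(f) for f in frame_f0]
```

Labels of this kind would serve as classification targets for the feature images; the network itself is outside the scope of this sketch.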

Improved Melody Recognition Performance of a Cochlear Implant Speech Processing Strategy Using Instantaneous Frequency Encoding Based on Teager Energy Operator

Choi, Sung-Jin;Ryu, Sang-Baek;Kim, Kyung-Hwan
- Journal of Biomedical Engineering Research / v.31 no.6 / pp.417-426 / 2010
We present a speech processing strategy incorporating instantaneous frequency (IF) encoding to enhance the melody recognition performance of cochlear implants. To extract the IF from incoming sound, we propose the use of a Teager energy operator (TEO), which has the advantage of a lower computational load. Using time-frequency analysis, we verified that the TEO-based method provides proper IF encoding of the input sound, which is crucial for melody recognition. A similar benefit could also be obtained with a Hilbert transform (HT), but at a much higher computational cost. The melody recognition performance of the proposed strategy was compared with that of a conventional strategy using envelope extraction and with that of HT-based IF encoding. Hearing tests on subjects with normal hearing were performed using acoustic simulation and a musical contour identification task. No significant difference in melody recognition performance was observed between the TEO-based and HT-based IF encodings, and both were superior to the conventional strategy. However, the TEO-based strategy had the advantage of being approximately 35% faster than the HT-based strategy.
https://doi.org/10.9718/JBER.2010.31.6.417
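To make the TEO-based frequency extraction described above more concrete, here is a minimal sketch of the discrete Teager energy operator, psi[x](n) = x(n)^2 - x(n-1)x(n+1), combined with the DESA-2 energy separation formula to estimate instantaneous frequency. The use of DESA-2 is an assumption for illustration; the abstract does not say which TEO-based estimator the authors used, and the function names are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager energy for interior samples n = 1 .. len(x) - 2."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def desa2_frequency(x, fs):
    """Estimate instantaneous frequency (Hz) with DESA-2 (assumed estimator).

    Omega(n) = 0.5 * acos(1 - psi[y](n) / (2 * psi[x](n))),
    where y(n) = x(n+1) - x(n-1).
    """
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager_energy(x)  # psi_x[k] belongs to sample n = k + 1
    psi_y = teager_energy(y)  # psi_y[k] belongs to sample n = k + 2
    freqs = []
    for k in range(len(psi_y)):
        px = psi_x[k + 1]  # align psi_x with sample n = k + 2
        if px <= 0.0:
            freqs.append(0.0)  # energy too small to estimate a frequency
            continue
        c = 1.0 - psi_y[k] / (2.0 * px)
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        omega = 0.5 * math.acos(c)  # radians per sample
        freqs.append(omega * fs / (2.0 * math.pi))
    return freqs


fs = 8000
tone = [math.cos(2.0 * math.pi * 440.0 * n / fs) for n in range(64)]
estimated = desa2_frequency(tone, fs)
```

Each estimate needs only a few multiplications per sample, which is consistent with the low computational load the abstract attributes to the TEO approach.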

Extracting Predominant Melody from Polyphonic Music using Harmonic Structure (하모닉 구조를 이용한 다성 음악의 주요 멜로디 검출)

Yoon, Jea-Yul;Lee, Seok-Pil;Seo, Kyeung-Hak;Park, Ho-Chong
- Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.5 / pp.109-116 / 2010
In this paper, we propose a method for extracting the predominant melody of polyphonic music based on harmonic structure. Since polyphonic music contains multiple sound sources, melody detection consists of extracting multiple fundamental frequencies and then determining the predominant melody from them. Harmonic structure is an important feature of a monophonic signal, which has spectral peaks at the integer multiples of its fundamental frequency. We extract all fundamental frequency candidates contained in the polyphonic signal by verifying the required condition of harmonic structure. Then, we combine the harmonic peaks corresponding to each extracted fundamental frequency and assign a rank to each candidate after calculating its harmonic average energy. Finally, we run pitch tracking based on the rank and the continuity of the extracted fundamental frequencies, and determine the predominant melody. We measure the performance of the proposed method on the ADC 2004 DB and 100 Korean pop songs in terms of the MIREX 2005 evaluation metrics, and obtain a pitch accuracy of 90.42%.
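The candidate ranking described in this abstract can be sketched roughly as follows. Given spectral peaks as (frequency, energy) pairs, each peak is treated as a fundamental frequency candidate, its harmonics are looked up at integer multiples, and surviving candidates are ranked by the average energy of the matched harmonics. The tolerance, the number of harmonics checked, and the minimum-match rule standing in for the "required condition" are assumptions for illustration, not the authors' settings.

```python
def rank_f0_candidates(peaks, n_harmonics=4, tol=0.02, min_found=3):
    """Rank F0 candidates by harmonic average energy.

    peaks: list of (frequency_hz, energy) spectral peaks.
    A candidate is kept only if at least min_found of its first
    n_harmonics harmonics match a peak within a relative tolerance tol
    (an assumed form of the harmonic-structure condition).
    Returns a list of (f0, average_energy), strongest first.
    """
    ranked = []
    for f0, _ in peaks:
        matched = []
        for h in range(1, n_harmonics + 1):
            target = h * f0
            near = [e for f, e in peaks if abs(f - target) <= tol * target]
            if near:
                matched.append(max(near))  # strongest peak near the harmonic
        if len(matched) >= min_found:
            ranked.append((f0, sum(matched) / len(matched)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Two overlapping sources: a strong one at 220 Hz and a weaker one at 300 Hz.
peaks = [(220, 1.0), (440, 0.8), (660, 0.6), (880, 0.4),
         (300, 0.3), (600, 0.2), (900, 0.1)]
ranking = rank_f0_candidates(peaks)
```

In this toy input, peaks such as 440 Hz are rejected as fundamentals because too few of their own harmonics are present, so only 220 Hz and 300 Hz remain as candidates. A pitch tracker would then combine this per-frame ranking with continuity across frames.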

Extracting Melodies from Polyphonic Piano Solo Music Based on Patterns of Music Structure (음악 구조의 패턴에 기반을 둔 다음(Polyphonic) 피아노 솔로 음악으로부터의 멜로디 추출)

Choi, Yoon-Jae;Lee, Ho-Dong;Lee, Ho-Joon;Park, Jong C.
- Proceedings of the HCI Society of Korea Conference / 2009.02a / pp.725-732 / 2009
Thanks to the development of the Internet, people can easily access a vast amount of music. This has drawn attention to applications such as melody-based music search and music recommendation services. Extracting melodies from music is a crucial step in providing such services. This paper introduces a novel algorithm that extracts melodies from piano music. Since the piano can produce polyphonic music, we expect that studying melody extraction from piano music will help with extracting melodies from general polyphonic music.

Extraction and Indexing Representative Melodies Considering Musical Composition Forms for Content-based Music Information Retrievals (내용 기반 음악 정보 검색을 위한 음악 구성 형식을 고려한 대표 선율의 추출 및 색인)

Ku, Kyong-I;Lim, Sang-Hyuk;Lee, Jae-Heon;Kim, Yoo-Sung
- The KIPS Transactions:PartD / v.11D no.3 / pp.495-508 / 2004
Recently, to reduce the response time of retrieving music data from large music databases, some content-based music information retrieval systems have adopted an indexing mechanism that extracts and indexes representative melodies. The representative melody of a piece must stand for the music itself and be likely to be used as a user's input query. However, since previous studies have not considered musical composition forms, they cannot correctly capture the contrast, repetition, and variation of motifs in musical forms. In this paper, we use an index automatically constructed from representative melodies such as the first melody, climax melodies, and similarly repeated theme melodies. First, we extend the clustering algorithm to extract similarly repeated theme melodies based on musical composition forms. If the clustering algorithm does not include the first melody and climax melodies among the representative melodies, we add them. We implemented a prototype system and ran experiments comparing the representative melody index with other melody indexes. Because the representative melody index requires 34% less storage than the whole melody index, the response time can be decreased. Also, because the representative melodies include the first melody and climax melody, which are likely to be used as users' input queries, we obtain more correct results for various user queries than with the theme melody index, at the cost of a 20% storage overhead.
https://doi.org/10.3745/KIPSTD.2004.11D.3.495

Implementation of the System Converting Image into Music Signals based on Intentional Synesthesia (의도적인 공감각 기반 영상-음악 변환 시스템 구현)

Bae, Myung-Jin;Kim, Sung-Ill
- Journal of IKEEE / v.24 no.1 / pp.254-259 / 2020
This paper presents the implementation of a system that converts images into music based on intentional synesthesia. The color, texture, and shape of the input image are converted into the melody, harmony, and rhythm of the music, respectively. The melody is selected probabilistically according to the color histogram. The texture of the image is mapped to harmony and minor key using seven features of the gray-level co-occurrence matrix (GLCM), a statistical texture feature extraction method. Finally, the shape of the image is extracted from the edge image, and line components are detected using the Hough transform; the rhythm is then selected according to the distribution of line angles to produce the music.
https://doi.org/10.7471/ikeee.2020.24.1.254

A Similarity Computation Algorithm for Music Retrieval System Based on Query By Humming (허밍 질의 기반 음악 검색 시스템의 유사도 계산 알고리즘)

Oh, Dong-Yeol;Oh, Hae-Seok
- Journal of the Korea Society of Computer and Information / v.11 no.4 s.42 / pp.137-145 / 2006
A user remembers a melody not as the combination of pitches and durations written in the score but as a contour composed of relative pitches and durations. Because of this, previous music information retrieval systems, which use keyboard playing or a score as the main melody input, are not well suited to query-by-humming systems. In this paper, we discuss the key considerations for a query-by-humming system and review previous research. We then propose a feature extraction method that resembles the way people remember a melody, together with similarity computation algorithms between the hummed melody and the melody in the music. The proposed similarity computation algorithms use relative durations to solve the problem that can occur when only relative pitches are used.
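The relative pitch and duration contour described above can be illustrated with a short sketch. It converts a note list into pitch intervals (in semitones) and duration ratios between consecutive notes, so that a transposed and slower hummed version yields the same features as the score. The representation of notes as (MIDI pitch, duration) pairs and the function name are assumptions for this example, not the paper's feature definition.

```python
def melody_contour(notes):
    """Return (intervals, duration_ratios) for consecutive notes.

    notes: list of (midi_pitch, duration) pairs with positive durations.
    Intervals are pitch differences in semitones; ratios compare each
    note's duration with the duration of the note before it.
    """
    pairs = list(zip(notes, notes[1:]))
    intervals = [nxt[0] - cur[0] for cur, nxt in pairs]
    ratios = [nxt[1] / cur[1] for cur, nxt in pairs]
    return intervals, ratios


score = [(60, 1.0), (62, 0.5), (64, 0.5)]
# Same tune hummed a fourth higher and at half the tempo.
hummed = [(65, 2.0), (67, 1.0), (69, 1.0)]
# Same pitches as the score but a different rhythm.
other = [(60, 0.5), (62, 0.5), (64, 1.0)]

same_tune = melody_contour(score) == melody_contour(hummed)
same_pitch_only = melody_contour(score)[0] == melody_contour(other)[0]
same_rhythm = melody_contour(score)[1] == melody_contour(other)[1]
```

The third melody shows the ambiguity mentioned in the abstract: relative pitches alone cannot separate `score` from `other`, while the duration ratios can.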

Indexing and Retrieval Mechanism using Variation Patterns of Theme Melodies in Content-based Music Information Retrievals (내용 기반 음악 정보 검색에서 주제 선율의 변화 패턴을 이용한 색인 및 검색 기법)

구경이;신창환;김유성
- Journal of KIISE:Databases / v.30 no.5 / pp.507-520 / 2003
In this paper, we propose a method for automatically constructing a theme melody index for a large music database, and an associative content-based music retrieval mechanism that mainly uses the constructed theme melody index to improve response time for users. First, the system automatically extracts the theme melody from a music file using a graphical clustering algorithm based on the similarities between motifs of the music. To place an extracted theme melody into the metric space of an M-tree, we chose the average length variation and the average pitch variation of the theme melody as the major features. Moreover, we added a pitch signature and a length signature, which summarize the pitch variation pattern and the length variation pattern of a theme melody, respectively, to increase the precision of retrieval results. We also propose an associative content-based music retrieval mechanism in which the k-nearest neighbor search and range search algorithms of the M-tree are used to select melodies similar to the user's query melody from the theme melody index. To improve user satisfaction, the proposed retrieval mechanism includes ranking and relevance feedback functions. We also implemented the proposed mechanisms as the essential components of a content-based music retrieval system to verify their usefulness.

Development of Audio Melody Extraction and Matching Engine for MIREX 2011 tasks

Song, Chai-Jong;Jang, Dalwon;Lee, Seok-Pil;Park, Hochong
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.07a / pp.164-166 / 2012
In this paper, we propose a method for extracting the predominant melody of polyphonic music based on harmonic structure. Harmonic structure is an important feature of a monophonic signal, which has spectral peaks at the integer multiples of its fundamental frequency. We extract all fundamental frequency candidates contained in the polyphonic signal by verifying the required condition of harmonic structure. Then, we combine the harmonic peaks corresponding to each extracted fundamental frequency and assign a rank to each candidate after calculating its harmonic average energy. We run pitch tracking based on the rank and the continuity of the extracted fundamental frequencies, and determine the predominant melody. For the query by singing/humming (QbSH) task, we propose a matching engine based on dynamic time warping (DTW). Our system reduces false alarms by combining the distances of multiple DTW processes. To improve performance, we introduce the asymmetric sense, pitch level compensation, and distance intransitiveness into the DTW algorithm.
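For readers unfamiliar with DTW-based matching, the following is a minimal sketch of the textbook DTW distance between two pitch sequences, with a simple stand-in for pitch level compensation that subtracts each sequence's median pitch before matching. The median-based compensation and the function names are assumptions for illustration; the abstract does not specify how the authors implement compensation, the asymmetric sense, or the combination of multiple DTW distances, and none of those refinements are reproduced here.

```python
import statistics


def center_pitch(seq):
    """Subtract the median pitch so key differences cancel (assumed scheme)."""
    med = statistics.median(seq)
    return [p - med for p in seq]


def dtw_distance(a, b):
    """Classic DTW with absolute-difference local cost and unit steps."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


reference = [60, 62, 64]        # MIDI pitches of the target melody
query = [65, 65, 67, 67, 69]    # hummed a fourth higher, each note held longer

raw = dtw_distance(reference, query)
compensated = dtw_distance(center_pitch(reference), center_pitch(query))
```

Warping absorbs the timing difference between the two sequences, while the compensation step removes the constant key offset that warping alone cannot handle.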

Implementation of a Tone Correction System Through a Visualization of Melody Comparison (멜로디 비교 시각화를 통한 음정 교정 시스템 구현)

Lee, Hye-In;Park, Ju-Hyun;Lee, Seok-Pil
- The Transactions of The Korean Institute of Electrical Engineers / v.63 no.1 / pp.156-161 / 2014
With the proliferation of digital music, the public's interest in music and desire to sing well are increasing. This paper presents the implementation of a tone correction system based on visualizing a comparison between music and humming data. We extract MIDI notes from the music and humming data and then design a matching engine using the DTW algorithm, which gives matching results that are robust to local timing variation and inaccurate tempo. The system is expected to help users correct wrong tones through visualization and feedback on the result.
https://doi.org/10.5370/KIEE.2014.63.1.156

Search Result 17, Processing Time 0.028 seconds
