Integrated Search | Korea Science

Automatic melody extraction algorithm using a convolutional neural network

Lee, Jongseol;Jang, Dalwon;Yoon, Kyoungro
- KSII Transactions on Internet and Information Systems (TIIS), Vol. 11, No. 12, pp. 6038-6053, 2017
In this study, we propose an automatic melody extraction algorithm using deep learning. Feature images, generated from the energy of frequency bands, are extracted from polyphonic audio files, and a convolutional neural network (CNN) is applied to these feature images. In the training data, each short frame of polyphonic music is labeled with a musical note, and a CNN-based classifier is trained to determine the pitch value of a short frame of audio signal. Because we aim to build a novel melody extraction structure, the proposed algorithm is deliberately simple: instead of combining various signal processing techniques, it uses only a CNN to find the melody in polyphonic audio. Despite this simple structure, the experiments produced promising results. The proposed algorithm did not give the best result compared with state-of-the-art algorithms, but its results were comparable, and we believe they could be improved with appropriate training data. In this paper, we first introduce melody extraction and the proposed algorithm, then explain the proposed algorithm in detail, and finally present our experiments and a comparison of results.
https://doi.org/10.3837/tiis.2017.12.019

Treatment Effect of a Modified Melodic Intonation Therapy (MMIT) in Korean Aphasics

Ko, Do-Heung;Jeong, Ok-Ran
- 음성과학 (Speech Sciences), Vol. 4, No. 2, pp. 91-102, 1998
The present study modified conventional Melodic Intonation Therapy (MIT) in three respects: the number of syllables in adjacent target utterances (ATU), the melody patterns of ATU, and initial listening to the melody and intoned speech with the eyes closed. The modified Melodic Intonation Therapy (MMIT) was applied to two Korean patients with severe nonfluent aphasia resulting from a left cerebrovascular accident (CVA). The purpose of the modification was to avoid perseveration and improve reflective listening skills. First, the treatment program avoided ATU with the same number of syllables. Second, four melody patterns were developed: rising, falling, V-type, and inverted V-type. Each prosodic pattern was preceded and followed by a different melody pattern. These two variations were intended to decrease perseverative behaviors. Finally, to improve reflective listening skills, the patients kept their eyes closed while the clinician played and hummed a target melody at the initial stage of the program. A single-subject alternating treatment design was used, and the effects of MMIT were compared with those of conventional MIT. Varying the number of syllables and the type of melodic pattern decreased perseverative behaviors and produced more correct names. Initial listening to the target melody with the eyes closed seemed to increase the patients' attentiveness and resulted in more fluent production of target utterances. Probable reasons for the effectiveness of MMIT are discussed.

오디오의 파형과 FFT 분석을 이용한 대표 선율 검색 (Representative Melodies Retrieval using Waveform and FFT Analysis of Audio)

정명범;고일주
- 한국정보과학회논문지:소프트웨어및응용 (Journal of KIISE: Software and Applications), Vol. 34, No. 12, pp. 1037-1044, 2007
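Recent content-based music retrieval systems shorten user response time by extracting a representative melody from each piece of music, indexing it, and using it at search time. Previous work proposed extracting representative melodies from MIDI data, but that approach is limited to MIDI data. This paper therefore proposes a representative melody retrieval method, based on digital signal processing, that is applicable to any audio file format. To find the representative melody, the FFT (Fast Fourier Transform) is used to find the beat and the bars, and the frequency with which high values occur in the PCM data of each bar is measured. The eight-bar span in the region where high values are most densely clustered is taken as the representative melody region of the audio data. To validate the proposed method, 1,000 songs were selected and their representative melodies extracted; the method achieved 79.5% accuracy on the 737 songs for which the tempo was found.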
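
The final selection step in this abstract (count the high values in each bar, then pick the eight-bar span where they cluster most densely) can be sketched as a sliding-window search. This is only an illustrative sketch, not the authors' implementation: it assumes bar boundaries are already known and that all bars have the same length in samples, and the function names and the `threshold` parameter are hypothetical.

```python
def count_high_values(samples, bar_length, threshold):
    """Count, for each complete bar, the samples whose magnitude exceeds threshold.

    Assumption: every bar spans exactly bar_length PCM samples.
    """
    counts = []
    for start in range(0, len(samples) - bar_length + 1, bar_length):
        bar = samples[start:start + bar_length]
        counts.append(sum(1 for s in bar if abs(s) > threshold))
    return counts


def representative_region(counts, window=8):
    """Return (first_bar, total) for the run of `window` bars with the most high values."""
    if len(counts) < window:
        raise ValueError("need at least `window` bars")
    total = sum(counts[:window])
    best_start, best_total = 0, total
    for start in range(1, len(counts) - window + 1):
        # Slide the window one bar to the right: add the new bar, drop the old one.
        total += counts[start + window - 1] - counts[start - 1]
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total
```

When several windows tie, this sketch keeps the earliest one; the abstract does not say how ties are resolved.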

Improved Melody Recognition Performance of a Cochlear Implant Speech Processing Strategy Using Instantaneous Frequency Encoding Based on Teager Energy Operator

Choi, Sung-Jin;Ryu, Sang-Baek;Kim, Kyung-Hwan
- 대한의용생체공학회:의공학회지 (Journal of Biomedical Engineering Research), Vol. 31, No. 6, pp. 417-426, 2010
We present a speech processing strategy that incorporates instantaneous frequency (IF) encoding to enhance the melody recognition performance of cochlear implants. To extract the IF from incoming sound, we propose using a Teager energy operator (TEO), which has the advantage of a lower computational load. Time-frequency analysis verified that the TEO-based method provides proper IF encoding of the input sound, which is crucial for melody recognition. A similar benefit could also be obtained with a Hilbert transform (HT), but at a much higher computational cost. The melody recognition performance of the proposed strategy was compared with that of a conventional strategy using envelope extraction and with that of HT-based IF encoding. Hearing tests were performed on normal-hearing subjects using acoustic simulation and a musical contour identification task. No significant difference in melody recognition performance was observed between the TEO-based and HT-based IF encodings, and both were superior to the conventional strategy. However, the TEO-based strategy had the advantage of being approximately 35% faster than the HT-based strategy.
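
To illustrate why a TEO-based frequency estimate is cheap, the sketch below applies the discrete Teager energy operator, psi[x](n) = x(n)^2 - x(n-1) x(n+1), and then the DESA-2 energy separation formula to estimate the frequency of a sampled tone. The abstract does not state how the IF is derived from the TEO, so the use of DESA-2 here is an assumption, and the function names are hypothetical.

```python
import math


def teager_energy(x, n):
    """Discrete Teager energy operator: x[n]^2 - x[n-1] * x[n+1]."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]


def instantaneous_frequency(x, n, sample_rate):
    """Estimate the frequency (Hz) at sample n using DESA-2 (an assumed choice).

    Requires 2 <= n <= len(x) - 3 and a frequency below sample_rate / 4.
    """
    # Symmetric difference signal y[k] = x[k+1] - x[k-1] for k = n-1, n, n+1.
    y = [x[k + 1] - x[k - 1] for k in (n - 1, n, n + 1)]
    psi_x = teager_energy(x, n)
    psi_y = teager_energy(y, 1)
    if psi_x <= 0:
        return 0.0  # energy too small or negative: no reliable estimate
    # For a pure tone, 1 - psi_y / (2 * psi_x) equals cos(2 * omega).
    c = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
    omega = 0.5 * math.acos(c)  # radians per sample
    return omega * sample_rate / (2.0 * math.pi)


# Usage: a 440 Hz cosine sampled at 16 kHz.
fs = 16000
tone = [math.cos(2.0 * math.pi * 440.0 * k / fs) for k in range(64)]
estimate = instantaneous_frequency(tone, 10, fs)
```

Each estimate needs only a handful of multiplications and one `acos` over five neighboring samples, whereas a Hilbert-transform approach has to build the analytic signal first.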
https://doi.org/10.9718/JBER.2010.31.6.417

기하학적 해싱 기법을 이용한 음악 검색 (Music Retrieval Using the Geometric Hashing Technique)

정효숙;박성빈
- 컴퓨터교육학회논문지 (The Journal of Korean Association of Computer Education), Vol. 8, No. 5, pp. 109-118, 2005
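This paper proposes a music retrieval system that compares the geometric structure of melodies in a music database with that of a melody described by the user. The system analyzes the structural and contextual features of melodies to find matches between the query melody and the melodies in the database. The retrieval method is based on the geometric hashing algorithm, which consists of a preprocessing stage and a recognition stage. During preprocessing, to find structural features, each melody is divided into several fragments and the pitch and duration of each note in a fragment are analyzed. To find contextual features, the central chord of each fragment is identified. During recognition, the query melody entered by the user is divided into fragments, and all fragments with similar structural and contextual features are retrieved from the database. A vote is cast for each fragment, and the music with the highest total number of votes is taken as the music whose melody matches the query melody. With this approach, similar melodies can be found quickly in a music database. The method can also be applied to detecting plagiarized music.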
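
The two-stage scheme in this abstract (index fragments during preprocessing, then let query fragments vote during recognition) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's system: fragments are fixed-size sliding windows, the structural key is each note's pitch and duration relative to the fragment's first note, matching is exact rather than similarity-based, and the contextual feature (the central chord) is left out. All names are hypothetical.

```python
from collections import Counter, defaultdict


def fragment_keys(melody, size=4):
    """Yield a transposition- and tempo-invariant key for each fragment.

    melody is a list of (midi_pitch, duration) pairs with positive durations.
    """
    for i in range(len(melody) - size + 1):
        frag = melody[i:i + size]
        base_pitch, base_dur = frag[0]
        # Express every note relative to the first note of the fragment.
        yield tuple((p - base_pitch, round(d / base_dur, 2)) for p, d in frag)


def build_index(database, size=4):
    """Preprocessing stage: map each fragment key to the songs containing it."""
    index = defaultdict(set)
    for title, melody in database.items():
        for key in fragment_keys(melody, size):
            index[key].add(title)
    return index


def recognize(index, query, size=4):
    """Recognition stage: each query fragment votes for the songs that contain it."""
    votes = Counter()
    for key in fragment_keys(query, size):
        for title in index.get(key, ()):
            votes[title] += 1
    return votes.most_common()


# Usage: the query is song "A" transposed up two semitones at half tempo.
database = {
    "A": [(60, 1), (62, 1), (64, 2), (65, 1), (67, 1)],
    "B": [(60, 1), (60, 1), (67, 1), (67, 1), (69, 1)],
}
index = build_index(database)
ranking = recognize(index, [(62, 2), (64, 2), (66, 4), (67, 2), (69, 2)])
```

Because lookups go through a hash table, the cost of a query depends on the number of query fragments rather than on the size of the database, which is the property behind the fast retrieval the abstract mentions.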

유전알고리즘 기반의 사용자 파라미터 설정과 코드 진행을 고려한 리듬과 멜로디 자동 작곡 시스템 (An Automatic Rhythm and Melody Composition System Considering User Parameters and Chord Progression Based on a Genetic Algorithm)

정재훈;안창욱
- 정보과학회 논문지 (Journal of KIISE), Vol. 43, No. 2, pp. 204-211, 2016
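This paper proposes a new evolutionary automatic music composition system that automatically generates ornate melodies using non-chord tones over a given chord progression. The system is divided into two stages, rhythm generation and melody generation. We introduce a rhythm fitness function controlled by user-set parameters, a melody fitness function designed on the basis of harmony theory, and evolutionary operators that take musical context into account and are designed to improve melody optimization performance. For the optimization of the proposed rhythm fitness function, we compared a standard genetic algorithm, a genetic algorithm with elitism, a differential evolution algorithm, and a particle swarm optimization algorithm. For the optimization of the melody fitness function, we verified performance by comparing these four algorithms with a genetic algorithm that uses the proposed evolutionary operators, and we carried out a musical analysis of the generated melodies.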
https://doi.org/10.5626/JOK.2016.43.2.204

허밍을 이용한 고품질 음악 생성 (Humming based High Quality Music Creation)

이윤재;김선민
- 한국소음진동공학회:학술대회논문집 (Proceedings of the Korean Society for Noise and Vibration Engineering Conference), 2014 Autumn Conference Proceedings, pp. 146-149, 2014
This paper describes a humming-based automatic music creation method. Composing music is generally difficult for members of the general public who have no knowledge of music theory. However, almost anyone can produce a main melody by humming. With this motivation, the melody and chord sequence are estimated by analyzing the humming, which is produced without a metronome. Then, based on the estimated chord sequence, an accompaniment is generated using the MIDI template matched to each chord. Five genres are supported for music creation. Melody transcription is evaluated in terms of onset and pitch estimation accuracy, and a mean opinion score (MOS) evaluation is used to assess the created music.

허밍 운율정보를 이용한 곡목 검색 기술 (Study on the song title query by humming melody information)

이지연;한민수
- 대한음성학회지:말소리 (Malsori, Journal of the Korean Society of Phonetic Sciences and Speech Technology), No. 44, pp. 131-143, 2002
Music query by humming is a challenging problem because the humming signal inevitably contains considerable variation and inaccuracy. In this paper, we suggest an algorithm for retrieving a desired song from a music database by humming its melody. To accommodate the inaccuracy of human humming, a new melody representation technique is proposed. The algorithm is based on pitch and duration information and performs fairly well: a correct retrieval rate of 85% is achieved for the top three matches when tested with 20 songs.

내용 기반 음악 정보 검색에서 주제 선율의 변화 패턴을 이용한 색인 및 검색 기법 (Indexing and Retrieval Mechanism using Variation Patterns of Theme Melodies in Content-based Music Information Retrievals)

구경이;신창환;김유성
- 한국정보과학회논문지:데이타베이스 (Journal of KIISE: Databases), Vol. 30, No. 5, pp. 507-520, 2003
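To improve the retrieval speed of content-based music information retrieval systems, this study proposes an efficient content-based retrieval technique that extracts theme melodies, the representative melodies of a piece of music, and builds a theme melody index from them. To organize the extracted theme melodies into an index using M-tree, a multidimensional spatial indexing method, the average pitch variation and the average note-length variation of each theme melody are used. To improve retrieval accuracy, a pitch signature summarizing the pattern of pitch changes and a length signature summarizing the pattern of note-length changes are used. In the proposed technique, pattern information is constructed from the user's query melody, and the k-nearest-neighbor and range searches of M-tree are used to retrieve music containing theme melodies similar to the query melody. The retrieved results are ranked and then presented for user feedback, a feature included to improve user satisfaction. A prototype content-based music information retrieval system incorporating the proposed theme melody indexing and content-based retrieval techniques was also implemented, demonstrating the practicality of the proposed approach.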
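
The indexing idea in this abstract (map each theme melody to a point given by its average pitch variation and average note-length variation, then answer queries with a nearest-neighbor search) can be sketched as below. The sketch makes several assumptions: variations are taken as mean absolute differences between consecutive notes, a brute-force scan stands in for the M-tree, and the signature-based refinement is omitted. All names are hypothetical.

```python
import math


def melody_features(melody):
    """Return (mean absolute pitch change, mean absolute note-length change).

    melody is a list of (midi_pitch, duration) pairs with at least two notes.
    """
    steps = list(zip(melody, melody[1:]))
    if not steps:
        raise ValueError("melody needs at least two notes")
    pitch = sum(abs(b[0] - a[0]) for a, b in steps) / len(steps)
    length = sum(abs(b[1] - a[1]) for a, b in steps) / len(steps)
    return pitch, length


def k_nearest(themes, query, k=3):
    """Return the k theme titles whose feature points lie closest to the query's.

    A linear scan is used here in place of an M-tree k-nearest-neighbor search.
    """
    q = melody_features(query)
    ranked = sorted(themes, key=lambda t: math.dist(melody_features(themes[t]), q))
    return ranked[:k]


# Usage: the query moves in whole steps with even note lengths, like theme "X".
themes = {
    "X": [(60, 1), (62, 1), (64, 1)],
    "Y": [(60, 1), (72, 4), (60, 1)],
}
nearest = k_nearest(themes, [(65, 1), (67, 1), (69, 1)], k=1)
```

A metric index such as an M-tree returns the same neighbors while pruning most distance computations, which matters once the collection is large.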

억양의 시각화를 통한 프랑스어의 억양학습 (Learning French Intonation with a Base of the Visualization of Melody)

이정원
- 음성과학 (Speech Sciences), Vol. 10, No. 4, pp. 63-71, 2003
This study reports an experiment on learning French intonation based on the visualization of melody, a technique that was employed in the early 1960s to reeducate people with communication disorders. In this paper, however, the visualization of melody was applied to foreign language learning, where it produced successful results in many respects, especially in learning foreign intonation. We used PitchWorks to visualize French intonation samples and conducted the intonation-learning experiment with the resulting bitmap pictures projected on a screen, so that the students could see the melody curve while listening to the sentences. The results of the experiment show that the students made considerable progress in learning intonation. They were much more motivated and showed greater improvement in recognizing intonation contours than when learning by hearing alone. However, the lack of animation in the bitmap files risks reducing the exercise to tedious pattern practice. It would be better to have the students use a sound analyzer designed for pitch analysis, such as PitchWorks, since they could then see their own fluctuating intonation visualized on the screen.
