• Title, Summary, Keyword: Speech Enhancement

Search Results: 314

Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting (커널 백피팅 알고리즘 기반의 가중 β-지수승 최소평균제곱오차 추정방식을 적용한 보컬음 분리 기법)

  • Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.35 no.1
    • /
    • pp.49-54
    • /
    • 2016
  • In this paper, we propose a vocal separation method using weighted ${\beta}$-order minimum mean square error estimation (WbE) based on a kernel back-fitting algorithm. In spoken speech enhancement, it is well known that WbE outperforms existing Bayesian estimators such as the minimum mean square error (MMSE) estimator of the short-time spectral amplitude (STSA) and the MMSE estimator of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, WbE is applied to a basic iterative kernel back-fitting algorithm to improve vocal separation performance on monaural music signals. The experimental results show that the proposed method achieves better separation performance than other existing methods.
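As a hedged sketch of the estimator family the abstract contrasts (notation assumed here, not taken from the paper, and the paper's weighting scheme is not reproduced): a ${\beta}$-order MMSE estimator recovers the clean short-time spectral amplitude from the noisy observation as

```latex
% Assumed notation, not from the paper: A_k is the clean spectral amplitude
% and Y_k the noisy spectral observation in frequency bin k.
\hat{A}_k \;=\; \left( \operatorname{E}\!\left[\, A_k^{\beta} \mid Y_k \,\right] \right)^{1/\beta}
```

Setting ${\beta}=1$ recovers the MMSE-STSA estimator, and the limit ${\beta}\to 0$ corresponds to the log-spectral (LSA) estimator, which is why ${\beta}$ acts as a tunable compromise between the two baselines the abstract mentions.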

A Correlational Study on Activities of Daily Living, Self-efficacy, Stroke Specific Quality of Life and Need for Self-help Management Programs for Patients with Hemiplegia at Home (재가 뇌졸중환자의 일상생활활동, 자기효능감, 삶의 질, 자조관리프로그램요구도와의 관계에 관한 연구)

  • Kim Keum-Soon
    • Journal of Korean Academy of Fundamentals of Nursing
    • /
    • v.8 no.1
    • /
    • pp.81-94
    • /
    • 2001
  • The purpose of this study was to identify levels of activities of daily living, self-efficacy, stroke-specific quality of life, and need for self-help management programs among patients with hemiplegia at home. Data were collected from June to November 2000; subjects were 88 post-stroke patients living in Seoul and Kyunggi-do. The questionnaire consisted of five scales covering activities of daily living, self-efficacy, stroke-specific quality of life, and need for a self-help management program. Data were analyzed using frequencies, percentages, paired t-tests, and Pearson's correlation coefficients with SAS (version 6.12). The results are as follows: 1) Most subjects were partially independent in ADL, but they needed assistance with dressing, bathing, meal preparation, and housekeeping. 2) The mean self-efficacy score was 54.89 (range: 1 to 80), with large individual differences. 3) Satisfaction on the stroke-specific quality of life scale totaled 65.8%, a comparatively low value, especially for social role (51.4%), family functioning (58.3%), and mood (62.2%). 4) The highest needs for self-help management programs were for physical therapy, stress management, and range-of-motion exercise; the lowest were for elimination management and training, family counseling, and speech therapy. 5) Among demographic variables, sex showed significant differences in the dependent variables: females scored higher than males on IADL, self-efficacy, stroke-specific quality of life, and need for self-help management. 6) Age had a high negative correlation with ADL, self-efficacy, and stroke-specific quality of life; age was also correlated with need for self-help management. In conclusion, ADL, self-efficacy, and quality of life were highly correlated in post-stroke patients at home. These patients also had a strong need for self-help management programs, especially physical therapy and stress management. Therefore, rehabilitation programs based on self-efficacy enhancement need to be developed to promote independent living for patients with hemiplegia.


A Study on the Development Plan to Increase Supplement of Voice over Internet Protocol (인터넷전화의 보급 확산을 위한 발전방안에 관한 연구)

  • Park, Jae-Yong
    • Management & Information Systems Review
    • /
    • v.28 no.3
    • /
    • pp.191-210
    • /
    • 2009
  • The Internet was first designed only for sending data, but over time, driven by user demand and rapidly changing communication technology, it evolved into a broadband multimedia network capable of transmitting sound, video, high-capacity data, and more. Domestically, Saerom C&T launched a free VoIP service in January 2000, but it hit its growth limit due to limited calling modes (PC to PC), the absence of a revenue model, and poor speech quality. This research studied VoIP in light of technological advances in high-speed Internet. According to IDC, the domestic Internet telephony market was worth 80,800 million in 2008, accounting for 12.5% of the whole voice-communication market. VoIP can maximize its profit by connecting wired and wireless networks, and it has a chance of becoming a firm-concentrated monopoly market by converging with IPTV. Considering that MVNO revitalization is insignificant in our country, regulatory organizations will play a significant role in balancing profit between large and small businesses. Further research should be done to give VoIP a secure footing to prosper and become popularized.


Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.2
    • /
    • pp.127-142
    • /
    • 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. These have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models. In recent years, these supervised models have gained more popularity than unsupervised models such as deep belief networks, because they have shown notable applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation of "backward propagation of errors" and is a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network; the gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. Using this architecture makes convolutional networks fast to train, which in turn helps us train deep, multi-layer networks that are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks rest on three basic ideas: local receptive fields, shared weights, and pooling.
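The backpropagation-plus-gradient-descent procedure described in the abstract can be sketched numerically. The following minimal example (not from the paper; all names and sizes are illustrative) computes the analytic gradient of a tiny one-hidden-layer network by the chain rule, verifies it against a finite-difference estimate, and takes one gradient-descent step:

```python
import numpy as np

# A minimal sketch, assuming a 3-input, 4-hidden-unit, 1-output network with
# sigmoid hidden units and a squared-error loss. Illustrative only.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))              # 5 samples, 3 inputs
y = rng.normal(size=(5, 1))              # regression targets

W1 = rng.normal(size=(3, 4))             # hidden-layer weights
W2 = rng.normal(size=(4, 1))             # output-layer weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def loss(W1, W2):
    h = sigmoid(X @ W1)                  # hidden activations
    return 0.5 * np.sum((h @ W2 - y) ** 2)

# Backward pass: apply the chain rule layer by layer.
h = sigmoid(X @ W1)
err = h @ W2 - y                          # dE/d(output)
gW2 = h.T @ err                           # gradient for output weights
gW1 = X.T @ ((err @ W2.T) * h * (1 - h))  # gradient for hidden weights

# Finite-difference check on one entry of W1 confirms the analytic gradient.
eps = 1e-6
Wp = W1.copy(); Wp[0, 0] += eps
Wm = W1.copy(); Wm[0, 0] -= eps
numeric = (loss(Wp, W2) - loss(Wm, W2)) / (2 * eps)
print(abs(numeric - gW1[0, 0]) < 1e-5)    # True: backprop matches

# One gradient-descent step with a small learning rate lowers the error,
# which is exactly how the optimizer uses the gradient during training.
lr = 1e-3
assert loss(W1 - lr * gW1, W2 - lr * gW2) < loss(W1, W2)
```

The finite-difference check is a standard way to validate a hand-derived backward pass before trusting it for training.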
By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or the previous layer's) neurons. Shared weights mean that the same weights and bias are used for each local receptive field, so all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers, usually placed immediately after convolutional layers. What pooling layers do is simplify the information in the output of the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep networks took weeks a few years ago, but thanks to progress in GPUs and algorithmic improvements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state that allows the network to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem: vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through layers, which makes learning in early layers extremely slow. The problem actually gets worse in RNNs, since gradients are propagated backward not just through layers but through time; if the network runs for a long time, the gradient can become extremely unstable and hard to learn from. It has since become possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
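The three convolutional ideas in the abstract — local receptive fields, shared weights, and pooling — can be shown concretely. The sketch below (not the paper's model; the image, kernel, and sizes are illustrative) slides a single shared 3×3 kernel over an 8×8 image and then applies 2×2 max pooling:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid convolution: each output neuron sees only a local region of the
    input (local receptive field), and every position reuses the same kernel
    and would share one bias (shared weights)."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """2x2 max pooling: keep only the strongest response in each block,
    simplifying the convolutional layer's output."""
    H, W = fmap.shape
    H, W = H - H % size, W - W % size
    blocks = fmap[:H, :W].reshape(H // size, size, W // size, size)
    return blocks.max(axis=(1, 3))

image = np.zeros((8, 8)); image[:, 4] = 1.0   # a vertical edge at column 4
kernel = np.array([[1., 0., -1.]] * 3)        # a vertical-edge detector
fmap = conv2d(image, kernel)                  # 6x6 feature map
pooled = max_pool(fmap)                       # 3x3 pooled map
print(fmap.shape, pooled.shape)               # (6, 6) (3, 3)
```

Because the kernel is shared, the same edge detector fires wherever the edge appears, and pooling then reports only where the strongest response was — exactly the translation-tolerant feature detection the abstract describes.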