Search | Korea Science

A Study on Noise-Robust Methods for Broadcast News Speech Recognition (방송뉴스 인식에서의 잡음 처리 기법에 대한 고찰)

Chung Yong-joo
- MALSORI / no.50 / pp.71-83 / 2004
Recently, broadcast news speech recognition has become one of the most attractive research areas. If we can automatically transcribe broadcast news and store its content as text instead of as the video or audio signal itself, it becomes much easier to search multimedia databases for what we need. However, the desired speech signal in broadcast news is usually affected by interfering signals such as background noise and/or music. Also, the speech of a reporter who is speaking over the telephone or with a poorly conditioned microphone is severely distorted by channel effects. Such interfered or distorted speech may be the main reason for poor performance in broadcast news speech recognition. In this paper, we investigate several methods to cope with these problems and observe some performance improvements in noisy broadcast news speech recognition.

Korean Broadcast News Transcription Using Morpheme-based Recognition Units

Kwon, Oh-Wook;Alex Waibel
- The Journal of the Acoustical Society of Korea / v.21 no.1E / pp.3-11 / 2002
Broadcast news transcription is one of the hardest tasks in speech recognition because broadcast speech signals vary widely in speech quality, channel, and background conditions. We developed a Korean broadcast news speech recognizer. We used a morpheme-based dictionary and language model to reduce the out-of-vocabulary (OOV) rate. We concatenated original morpheme pairs of short length or high frequency in order to reduce insertion and deletion errors due to short morphemes. We used a lexicon with multiple pronunciations to reflect inter-morpheme pronunciation variations without severe modification of the search tree. By using the merged morphemes as recognition units, we achieved an OOV rate of 1.7% with a 64k vocabulary, comparable to that of European languages. We implemented a hidden Markov model-based recognizer with vocal tract length normalization and online speaker adaptation by maximum likelihood linear regression. Experimental results showed that the recognizer yielded a 21.8% morpheme error rate for anchor speech and 31.6% for mostly noisy reporter speech.
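To make the OOV figure in the abstract above concrete, here is a minimal sketch of how an out-of-vocabulary rate is commonly computed over a tokenized test set. The function name `oov_rate` and the toy tokens are illustrative assumptions and are not taken from the paper, which does not describe its evaluation code.

```python
def oov_rate(test_tokens, vocabulary):
    """Return the fraction of running test tokens that are not in the vocabulary.

    `test_tokens` is a list of recognition units (for example, merged morphemes);
    `vocabulary` is any iterable of known units. Both are hypothetical inputs.
    """
    if not test_tokens:
        return 0.0
    vocab = set(vocabulary)
    # Count every occurrence (token-level, not type-level) of an unknown unit.
    missing = sum(1 for token in test_tokens if token not in vocab)
    return missing / len(test_tokens)


# Toy example: one of four running tokens is unknown.
vocabulary = ["news", "broadcast", "anchor"]
test_tokens = ["broadcast", "news", "reporter", "anchor"]
rate = oov_rate(test_tokens, vocabulary)
```

In this toy setup `rate` is the share of test tokens missing from the lexicon; merging short morphemes changes the unit inventory and therefore both the vocabulary and the token stream the rate is measured on.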

A Comparative Study of the Diachronic Change in the Transmission Rate of Broadcast Messages (방송 메시지 전달 속도의 통시적 비교에 관한 연구: 라디오뉴스 전달 속도 분석을 중심으로)

Park, Kyung-Hee
- MALSORI / no.64 / pp.15-37 / 2007
The purpose of this paper is to examine how the transmission rate of broadcast messages has changed over time. To this end, I collected recorded news tapes from the past and selected 22 radio news broadcasts from the era of Japanese Imperialism, the 1950s, the 1960s, and the contemporary age. I then measured each announcer's reading rate and compared present-day news-reading rates with those of approximately 50 years ago. The results are as follows: the average reporting rate of newscasters differs in each era. From these results, the diachronic change in the transmission rate of broadcast messages can be readily seen. Namely, the results show that present-day announcers read the news 68% faster than announcers in the era of Japanese Imperialism.

Retrieval of Broadcast News Using Audio Content Analysis

Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea / v.26 no.3E / pp.74-79 / 2007
In this paper, we report our recent work on an indexing and retrieval system for broadcast news using audio content analysis. The key issues addressed in this work are two major parts of the audio indexing system: anchorperson detection based on audio segmentation, and phone-based spoken document retrieval, both developed in the framework of the emerging MPEG-7 standard. Experiments are conducted on a database of British broadcast news videos. We discuss the development of the retrieval system and the evaluation of each part and of the system as a whole.

Language Model Adaptation for Broadcast News Recognition (방송 뉴스 인식을 위한 언어 모델 적응)

Kim Hyun Suk;Jeon Hyung Bae;Kim Sanghun;Choi Joon Ki;Yun Seung
- MALSORI / no.51 / pp.99-115 / 2004
In this paper, we propose LM adaptation for broadcast news recognition. We collect recent articles from the internet in real time, build a small LM from them, and then interpolate this recent LM with an existing LM built from a large existing broadcast news corpus. Because the collected recent corpus contains both articles related to the test set and unrelated ones, we performed interpolation experiments to determine which type of articles from the recent corpus works best. When we built an adapted LM from the existing LM and a recent LM made of articles selected as similar to the test set by the TF-IDF method, we obtained the best result: a 17.2% ERR in pseudo-morpheme-based recognition performance and a reduction in the number of OOVs from 70 to 27.
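The interpolation step described in the abstract above can be sketched as follows. The abstract does not say which interpolation scheme was used; this sketch assumes plain linear interpolation of two unigram models, and the names `unigram_lm`, `interpolate`, and the weight `lam` are hypothetical.

```python
from collections import Counter


def unigram_lm(tokens):
    """Maximum-likelihood unigram model: word -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(recent_lm, base_lm, lam):
    """Linear interpolation: P(w) = lam * P_recent(w) + (1 - lam) * P_base(w)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    words = set(recent_lm) | set(base_lm)
    return {
        word: lam * recent_lm.get(word, 0.0) + (1.0 - lam) * base_lm.get(word, 0.0)
        for word in words
    }


# Toy corpora: the recent text introduces a word ("b") the base corpus lacks.
recent = unigram_lm(["a", "b"])
base = unigram_lm(["a", "a", "c", "c"])
adapted = interpolate(recent, base, lam=0.2)
```

Because both inputs are proper distributions, the interpolated probabilities still sum to one, and a word seen only in the recent articles receives non-zero probability, which is how this kind of adaptation can reduce OOV-related errors.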

Statistical Analysis Between Size and Balance of Text Corpus by Evaluation of the effect of Interview Sentence in Language Modeling (언어모델 인터뷰 영향 평가를 통한 텍스트 균형 및 사이즈간의 통계 분석)

Jung Eui-Jung;Lee Youngjik
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.87-90 / 2002
This paper statistically analyzes the relationship between the size and balance of a text corpus by evaluating the effect of interview sentences on the language model of a Korean broadcast news transcription system. The ultimate purpose of our Korean broadcast news transcription system is to recognize not interview speech but the anchor's and reporter's speech in broadcast news shows. However, interview sentences make up approximately 15% of the text corpus gathered for constructing the language model. Interview sentences differ in character from the anchor's and reporter's sentences in various ways, and they therefore disturb anchor- and reporter-oriented language modeling. In this paper, we evaluate the effect of interview sentences on the language model for the Korean broadcast news transcription system and statistically analyze the relationship between corpus size and balance by repeating the same experimental procedure while varying the size of the corpus.

Analysis of Emotions in Broadcast News Using Convolutional Neural Networks (CNN을 활용한 방송 뉴스의 감정 분석)

Nam, Youngja
- Journal of the Korea Institute of Information and Communication Engineering / v.24 no.8 / pp.1064-1070 / 2020
In Korea, video-based news broadcasters are primarily classified into terrestrial broadcasters, general programming cable broadcasters, and YouTube broadcasters. Recently, news broadcasters have become more subjective as they target specific desired audiences. This violates the audience's normative expectations of impartiality and neutrality in journalism and may have a negative impact on audience perceptions of issues. This study examined whether broadcast news reporting conveys emotions and, if so, how news broadcasters differ according to emotion type. Emotion types were classified into neutrality, happiness, sadness, and anger using a convolutional neural network, which is a class of deep neural networks. Results showed that news anchors and reporters tend to express their emotions during TV broadcasts regardless of broadcast system. This study provides the first quantitative investigation of emotions in broadcast news. In addition, this study is the first deep learning-based approach to emotion analysis of broadcast news.
https://doi.org/10.6109/jkiice.2020.24.8.1064

Korean LVCSR for Broadcast News Speech

Lee, Gang-Seong
- The Journal of the Acoustical Society of Korea / v.20 no.2E / pp.3-8 / 2001
In this paper, we examine a Korean large vocabulary continuous speech recognition (LVCSR) system for broadcast news speech. A combined vowel and implosive unit is included in the phone set together with other short phone units in order to obtain a longer-unit acoustic model. The effect of this unit is compared with that of conventional phone units. The dictionary units for language processing are automatically extracted from eojeols appearing in transcriptions. Triphone models are used for acoustic modeling and a trigram model is used for language modeling. Of the three major speaker groups in news broadcasts, namely anchors, journalists, and other people (interviewees who are neither anchors nor journalists), the speech of anchors and journalists, which has a lot of noise, was used for testing and recognition.

Introduction of ETRI Broadcast News Speech Recognition System (ETRI 방송뉴스음성인식시스템 소개)

Park Jun
- Proceedings of the KSPS conference / 2006.05a / pp.89-93 / 2006
This paper presents the ETRI broadcast news speech recognition system. There are two major issues in broadcast news speech recognition: 1) real-time processing and 2) out-of-vocabulary handling. For real-time processing, we devised a dual decoder architecture. The input speech signal is segmented at long pauses between utterances, and the two decoders process the speech segments alternately. One decoder can start to recognize the current speech segment without waiting for the other decoder to finish recognizing the previous segment, so processing delay does not accumulate. For out-of-vocabulary handling, we updated both the vocabulary and the language model based on recent news articles on the internet. By updating the language model as well as the vocabulary, we can improve performance by up to 17.2% ERR.
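The dual decoder idea in the abstract above can be illustrated with a small sketch. This is not the ETRI implementation, which the abstract does not detail; it assumes two single-threaded workers that receive pause-delimited segments alternately, and `decode` is a hypothetical stand-in for a real recognizer.

```python
from concurrent.futures import ThreadPoolExecutor


def decode(segment):
    """Placeholder recognizer: a real system would return a transcript."""
    return segment.upper()


def dual_decode(segments, decode_fn=decode):
    """Send segments alternately to two decoders and return results in order.

    Segment i goes to decoder i % 2, so segment i + 1 can start on the other
    decoder before segment i has finished.
    """
    decoders = [ThreadPoolExecutor(max_workers=1) for _ in range(2)]
    try:
        futures = [
            decoders[index % 2].submit(decode_fn, segment)
            for index, segment in enumerate(segments)
        ]
        # Collect in submission order so the transcript keeps the input order.
        return [future.result() for future in futures]
    finally:
        for decoder in decoders:
            decoder.shutdown()


transcripts = dual_decode(["first segment", "second segment", "third segment"])
```

Using one worker per decoder keeps each decoder strictly sequential, which mirrors the alternating hand-off described in the abstract; a shared pool of two workers would also overlap work but would not guarantee strict alternation.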

Analysis of the Types of News Stories on the Online Broadcast -Focusing upon the Broadcasting Websites of NAVER Newsstand- (온라인 방송의 뉴스기사 유형에 대한 분석 -네이버 뉴스스탠드의 방송사 홈페이지를 중심으로-)

Park, Kwang Soon
- Journal of Digital Convergence / v.19 no.3 / pp.177-185 / 2021
This paper aimed to determine the proportions of the types of news stories in online broadcasting by analyzing the news stories of 9 broadcasting websites on the NAVER Newsstand. For the analysis, a total of 270 days of samples were selected, 30 days for each of the 9 broadcasting websites. One-way ANOVA was used to examine the differences among the broadcasting websites. The analysis focused on the type of news story by language composition, the type of genre as a standard of stories, and so on. The analysis showed that all programs in off-line broadcasting were produced and transmitted as video stories, whereas half of those in online broadcasting consisted of stories composed of photos and text. Online newspapers have been producing a new type of news story using video or computer graphics, while online broadcasters have been actively using stories composed of photos and text, which is the newspaper type of story. These results suggest that the boundaries among media are becoming increasingly indistinct in the online media environment, and that the broadcast type of story is becoming old-fashioned.
https://doi.org/10.14400/JDC.2021.19.3.177
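For readers unfamiliar with the one-way ANOVA mentioned in the abstract above, the following is a minimal sketch of the F statistic it is based on. The function name `one_way_anova_f` and the toy numbers are illustrative assumptions; they are not data from the paper, and a statistics package would normally also report the p-value.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    if k < 2 or any(len(group) == 0 for group in groups) or n <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(group) for group in groups) / n
    means = [sum(group) / len(group) for group in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2 for group, mean in zip(groups, means)
    )
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum(
        (value - mean) ** 2 for group, mean in zip(groups, means) for value in group
    )
    # Note: ss_within is zero (division by zero) if every group is constant.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data: three hypothetical websites, three daily counts each.
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F value means the group means differ by more than the within-group variation would suggest; in the study's setting each group would hold the daily measurements for one broadcasting website.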

Search Result 97, Processing Time 0.029 seconds
