Search | Korea Science

The Structural Relationships of between AI-based Voice Recognition Service Characteristics, Interactivity and Intention to Use (AI기반 음성인식 서비스 특성과 상호 작용성 및 이용 의도 간의 구조적 관계)

Lee, SeoYoung
- Journal of Information Technology Services / v.20 no.5 / pp.189-207 / 2021
Voice interaction combined with artificial intelligence is poised to revolutionize human-computer interaction with the advent of virtual assistants. This paper analyzes the effects of interactive elements of AI-based voice recognition services, such as sympathy, assurance, intimacy, and trust, on intention to use. A questionnaire survey was conducted with 284 smartphone/smart TV users in Korea. The collected data were analyzed using structural equation modeling and bootstrapping. The key results are as follows. First, AI-based voice recognition service characteristics such as sympathy, assurance, intimacy, and trust have positive effects on interactivity with the AI-based voice recognition service. Second, interactivity with the AI-based voice recognition service has a positive effect on intention to use. Third, AI-based voice recognition service characteristics such as interactional enjoyment and intimacy have direct positive effects on intention to use. Fourth, AI-based voice recognition service characteristics such as sympathy, assurance, intimacy, and trust have indirect positive effects on intention to use the AI-based voice recognition service, mediated by interactivity with the service. This study is meaningful in that it investigates the factors affecting interactivity with, and intention to use, voice recognition assistants, and it has practical and academic implications.
https://doi.org/10.9716/KITS.2021.20.5.187

An Extension of the VoiceXML Platform for Push-based Voice Applications (푸쉬형 음성 서비스를 위한 VoiceXML 플랫폼의 확장)

김경란;홍기형
- The Journal of the Acoustical Society of Korea / v.21 no.1 / pp.27-36 / 2002
VoiceXML is a standard dialog markup language for next-generation voice applications. The current VoiceXML 1.0 specification is silent on who places outbound calls for push-based voice applications. Push-based voice applications have become very important in modern information systems such as CRM. In this paper, we design and implement an extended VoiceXML platform that supports both inbound and outbound voice information services. We also extend the VoiceXML DTD to support inbound/outbound fax, based on the Call Control Requirements of the W3C.

The Interactive Voice Services based on VoiceXML (VoiceXML 기반 음성인식시스템을 이용한 서비스 개발)

Kim Hak-Gyoon;Kim Eun-Hyang;Kim Jae-In;Koo Myoung-Wan
- MALSORI / no.43 / pp.113-125 / 2002
As there is a need to search Web information via wired or wireless telephones, the VoiceXML Forum was established to develop and promote the Voice eXtensible Markup Language (VoiceXML). VoiceXML simplifies the creation of personalized interactive voice response services on the Web, and allows voice and phone access to information on Web sites and in call center databases. It can also utilize Web-based technologies such as CGI (Common Gateway Interface) scripts. In this paper, we develop a VoiceXML-based voice portal service platform called TeleGateway. It enables the integration of voice services with data services using Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engines. We also demonstrate various services on the voice portal.

A Study on Realization of Speech Recognition System based on VoiceXML for Railroad Reservation Service (철도예약서비스를 위한 VoiceXML 기반의 음성인식 구현에 관한 연구)

Kim, Beom-Seung;Kim, Soon-Hyob
- Journal of the Korean Society for Railway / v.14 no.2 / pp.130-136 / 2011
This paper proposes a method for implementing real-time speech recognition using VoiceXML in a SIP-based telephony environment for a railroad reservation service. In this method, a voice signal arriving through the PSTN or the Internet is handled as a dialog using VoiceXML, the transferred voice signal is processed by the Speech Recognition System, and the output is returned to the VoiceXML dialog, which is delivered to the user. The VASR system consists of a dialog server that processes dialogs, an APP server that processes voice signals, and a Speech Recognition System that performs speech recognition. The method implements a transfer path to the Speech Recognition System in which the voice signal is recorded using the Record tag function of VoiceXML, so that voice signals can be processed in the telephony environment, and is played in real time.
https://doi.org/10.7782/JKSR.2011.14.2.130

Design and Implementation of the English Education Testing System Interface Based on VoiceXML (VoiceXML 기반 영어 교육 평가 시스템 설계 및 구현)

Jang, Seung Ju
- The Journal of Korean Association of Computer Education / v.8 no.6 / pp.75-83 / 2005
In this paper, we study the English listening and speaking test portion of foreign language education using a web- and VoiceXML-based education testing system that is independent of time and place. The VoiceXML-based testing system interface consists of a user registration module, a testing module, and a testing result module. The user registration module registers the user's name, ID, and password in the user database, and when a tester calls in for testing, the tester listens to the telephone audio provided by the VXML scenario. After that, when the tester logs in, the tester is verified. In the VoiceXML-based education testing system, the manager can reduce the time and effort needed to obtain testing results. The tester can listen to the voice scenario provided by the VoiceXML markup language over a wired or wireless telephone at any time and from anywhere, and can improve the effectiveness of foreign language study by being evaluated directly by voice.

A Study on the Voice Interface for Mobile Environment (모바일기반 음성인터페이스에 관한 연구)

Kim, Soo-Hoon;Ahn, Jong-Young
- The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.1 / pp.199-204 / 2013
Google's Android-based voice interface is limited to web applications and has few users. In this paper, we suggest a method that uses the existing Android-based voice engine and develop a voice application. We also study the environments of the Android-based voice interface and present a voice interface appropriate for the mobile environment.
https://doi.org/10.7236/JIIBC.2013.13.1.199

Standardization Voice Training Method for Professional Voice User Based on Traditional Bel Canto (전통적 벨칸토 발성훈련법에 기초한 음성전문직업인 발성훈련의 표준화)

Kim, Chul Jun
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.28 no.1 / pp.17-19 / 2017
Opera singers train their vocal organs to achieve a good vocal timbre. They train repeatedly to develop strong resonance, a wide vocal range, a homogeneous voice color, and a voice that carries far, and to avoid vocal disorders. This article analyzes this training from a scientific and medical perspective, approaching the secret of a great art with 400 years of history: bel canto. Furthermore, standardizing a voice training method based on bel canto will facilitate the training, therapy, and care of professional voice users and patients with voice disorders.

Voice-based Device Control Using oneM2M IoT Platforms

Jeong, Isu;Yun, Jaeseok
- Journal of the Korea Society of Computer and Information / v.24 no.3 / pp.151-157 / 2019
In this paper, we present a prototype system for controlling IoT home appliances via voice-based commands. Voice commands have been widely deployed as one of the unobtrusive user interfaces for applications in a variety of IoT domains. However, interoperability between diverse IoT systems is limited because the several dominant companies providing voice assistants, such as Amazon Alexa and Google Now, rely on proprietary systems. oneM2M, a global IoT standard, has been proposed to mitigate the lack of interoperability between IoT systems. We deployed oneM2M-based platforms for a voice recording device, such as a wrist band, and an LED control device, standing in for a home appliance. We developed all the components for recording voices and controlling IoT devices, and we demonstrate the feasibility of our proposed method, based on oneM2M platforms and the Google STT (Speech-to-Text) API, by showing a user scenario for turning the LED device on and off via voice commands.
https://doi.org/10.9708/jksci.2019.24.03.151

Zero-shot voice conversion with HuBERT

Hyelee Chung;Hosung Nam
- Phonetics and Speech Sciences / v.15 no.3 / pp.69-74 / 2023
This study introduces an innovative model for zero-shot voice conversion that utilizes the capabilities of HuBERT. Zero-shot voice conversion models can transform the speech of one speaker to mimic that of another, even when the model has not been exposed to the target speaker's voice during the training phase. Comprising five main components (HuBERT, feature encoder, flow, speaker encoder, and vocoder), the model offers remarkable performance across a range of scenarios. Notably, it excels in the challenging unseen-to-unseen voice-conversion tasks. The effectiveness of the model was assessed based on the mean opinion scores and similarity scores, reflecting high voice quality and similarity to the target speakers. This model demonstrates considerable promise for a range of real-world applications demanding high-quality voice conversion. This study sets a precedent in the exploration of HuBERT-based models for voice conversion, and presents new directions for future research in this domain. Despite its complexities, the robust performance of this model underscores the viability of HuBERT in advancing voice conversion technology, making it a significant contributor to the field.
https://doi.org/10.13064/KSSS.2023.15.3.069

Pitch Modification based on a Voice Source Model (음원 모델에 기초한 합성음의 피치 조절)

Choi, Yong-Jin;Yeo, Su-Jin;Kim, Jin-Young;Sung, Koeng-Mo
- Speech Sciences / v.3 / pp.132-147 / 1998
Previously developed methods for pitch modification have not been based on a voice source model. As a result, the synthesized speech often sounds unnatural even though it may be highly intelligible. The purpose of this paper is to analyze how the voice source signal changes with the pitch period and to establish a pitch-modification rule based on the result of this analysis. Using the excitation waveform, we examine how the intervals of the closing phase, closed phase, and open phase change as the pitch increases. Compared with previous methods, which operate directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study may benefit applications such as speaker identification and voice color conversion, and the proposed method is expected to provide high-quality synthetic speech.

Search Result 1,573, Processing Time 0.025 seconds
