Integrated Search | Korea Science

Dual-stream Co-enhanced Network for Unsupervised Video Object Segmentation

Hongliang Zhu;Hui Yin;Yanting Liu;Ning Chen
- KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 4 / pp.938-958 / 2024
Unsupervised Video Object Segmentation (UVOS) is a highly challenging problem in computer vision because no annotation of the target object is available for the test video. The main difficulty lies in handling the complex, changing motion of the target object and the confusion caused by similar background objects in the video sequence. In this paper, we propose a novel deep Dual-stream Co-enhanced Network (DC-Net) for UVOS based on bidirectional motion-cue refinement and multi-level feature aggregation, which exploits motion cues and integrates features from different levels to produce high-quality segmentation masks. DC-Net is a dual-stream architecture in which the two streams enhance each other. One is a motion stream with a Motion-cues Refine Module (MRM), which learns from bidirectional optical flow images and produces a fine-grained, complete, and distinctive motion saliency map. The other is an appearance stream with a Multi-level Feature Aggregation Module (MFAM) and a Context Attention Module (CAM), which are designed to integrate features from different levels effectively. Specifically, the motion saliency map produced by the motion stream is fused into each stage of the decoder in the appearance stream to improve segmentation, and in turn the segmentation loss in the appearance stream feeds back into the motion stream to enhance motion refinement. Experimental results on three datasets (Davis2016, VideoSD, SegTrack-v2) demonstrate that DC-Net achieves results comparable to some state-of-the-art methods.
https://doi.org/10.3837/tiis.2024.04.007

A Novel Way of Context-Oriented Data Stream Segmentation using Exon-Intron Theory

이승훈;서동혁
- 한국전자통신학회논문지 / Vol. 16, No. 5 / pp.799-806 / 2021
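In Internet of Things environments, event data from sensors is reported continuously over time. Because event data arriving in this way accumulates without bound, a method is needed to analyze and manage it efficiently. This study proposes a data stream segmentation method that supports the effective selection and use of event data continuously reported by sensors. An identifier is selected to mark the point at which analysis should begin. Retaining this identifier makes the target of analysis explicit and reduces the amount of data to be processed. Because the identifier proposed for stream segmentation is based on the occurrence of events in each stream, the method can be described as meaning-centered data stream segmentation. The presence of an identifier in stream processing is useful for improving efficiency and reducing cost in environments with large, continuous data inflow.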
https://doi.org/10.13067/JKIECS.2021.16.5.799
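
The abstract above describes cutting a continuous event stream at identifier events so that analysis starts only from those points. A minimal Python sketch of that idea follows. The event layout `(timestamp, sensor_id, event_type)`, the function name `segment_stream`, and the choice of `"door_open"` as the identifier are all hypothetical illustrations, not details from the paper.

```python
from typing import Iterable, Iterator, List, Set, Tuple

# Hypothetical event layout: (timestamp, sensor_id, event_type).
Event = Tuple[float, str, str]


def segment_stream(events: Iterable[Event], marker_types: Set[str]) -> Iterator[List[Event]]:
    """Yield segments that each begin at an identifier (marker) event."""
    current: List[Event] = []
    for event in events:
        if event[2] in marker_types:
            # A marker closes the previous segment and opens a new one.
            if current:
                yield current
            current = [event]
        elif current:
            # Non-marker events are kept only once a segment is open;
            # events seen before the first marker are dropped.
            current.append(event)
    if current:
        yield current


# Made-up sample data for illustration only.
sample_events = [
    (0.0, "s1", "temp"),
    (1.0, "s1", "door_open"),
    (2.0, "s2", "temp"),
    (3.0, "s1", "door_open"),
    (4.0, "s2", "motion"),
]
segments = list(segment_stream(sample_events, {"door_open"}))
```

Dropping everything before the first marker is one possible reading of "reducing the amount of data to be processed"; a real system might buffer those events instead.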

Implementation of Image Semantic Segmentation on Android Device using Deep Learning

이용환;김영섭
- 반도체디스플레이기술학회지 / Vol. 19, No. 2 / pp.88-91 / 2020
Image segmentation is the task of partitioning an image into multiple sets of pixels based on some characteristics. The objective is to simplify the image into a representation that is more meaningful and easier to analyze. In this paper, we apply deep learning to pre-train the model and implement an algorithm that performs image segmentation in real time by extracting frames from the stream input of the Android device. Based on the open-source implementation of DeepLab-v3+ in TensorFlow, some convolution filters are modified to improve real-time operation on the Android platform.

Speaker Change Detection Based on a Graph-Partitioning Criterion

Seo, Jin-Soo
- 한국음향학회지 / Vol. 30, No. 2 / pp.80-85 / 2011
Speaker change detection involves identifying the time indices in an audio stream at which the identity of the speaker changes. In this paper, we propose novel measures for speaker change detection based on a graph-partitioning criterion over the pairwise distance matrix of the feature-vector stream. Experiments on both synthetic and real-world data showed that the proposed approach yields promising results compared with conventional statistical measures.
https://doi.org/10.7776/ASK.2011.30.2.080
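
To make the idea of a graph-partitioning criterion over a pairwise distance matrix concrete, here is a minimal Python sketch. The abstract does not state which criterion the paper uses, so this sketch substitutes the well-known normalized cut as an assumption. The Gaussian similarity, the `sigma` parameter, all function names, and the toy feature vectors are likewise illustrative choices.

```python
import math
from typing import List, Sequence


def similarity_matrix(features: Sequence[Sequence[float]], sigma: float = 1.0) -> List[List[float]]:
    """Turn pairwise Euclidean distances into Gaussian similarities (graph edge weights)."""
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(features[i], features[j])
            w[i][j] = math.exp(-(d * d) / (2.0 * sigma * sigma))
    return w


def normalized_cut(w: List[List[float]], t: int) -> float:
    """Normalized-cut score for splitting the frames into [0, t) and [t, n)."""
    n = len(w)
    left, right = range(t), range(t, n)
    cut = sum(w[i][j] for i in left for j in right)
    assoc_left = sum(w[i][j] for i in left for j in range(n))
    assoc_right = sum(w[i][j] for i in right for j in range(n))
    return cut / assoc_left + cut / assoc_right


def best_change_point(features: Sequence[Sequence[float]], sigma: float = 1.0) -> int:
    """Return the split index with the lowest normalized-cut score."""
    w = similarity_matrix(features, sigma)
    return min(range(1, len(features)), key=lambda t: normalized_cut(w, t))


# Toy feature stream: three frames near (0, 0), then three near (5, 5).
frames = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0), (5.0, 5.1), (5.1, 5.0), (5.0, 4.9)]
change_index = best_change_point(frames)
```

A low score means the two sides of the split are weakly connected to each other relative to their total connectivity, which is the behaviour expected at a speaker change.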

An Automatic Road Sign Recognizer for an Intelligent Transport System

Miah, Md. Sipon;Koo, Insoo
- Journal of information and communication convergence engineering / Vol. 10, No. 4 / pp.378-383 / 2012
This paper presents the implementation of an automatic road sign recognizer for an intelligent transport system. In this system, lists of road signs are processed with actions such as line segmentation, single-sign segmentation, and storing an artificial sign in the database. Road sign recognition is the process of taking the video stream, extracting the road sign, and storing it in the database. This paper studies the recognition of traffic sign patterns using a segmentation technique to improve the efficiency and speed of the system. The image is converted from one scale to another, such as RGB to grayscale or grayscale to binary. The images are pre-processed with several image processing techniques, such as thresholding, Gaussian filters, Canny edge detection, and contour extraction.
https://doi.org/10.6109/jicce.2012.10.4.378
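
The scale conversions mentioned above (RGB to grayscale, then grayscale to binary) can be sketched in plain Python as follows. The luma weights are the common ITU-R BT.601 coefficients and the fixed global threshold of 128 is an arbitrary assumption. The abstract does not say which weights or thresholding technique the authors used, and the function names are illustrative.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def rgb_to_gray(image: List[List[Pixel]]) -> List[List[int]]:
    """Convert an RGB image to grayscale using BT.601 luma weights (assumed)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def gray_to_binary(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Binarize with a fixed global threshold (assumed): 1 if value >= threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray]


# Tiny 2x2 test image: white, black, pure red, pure green.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 255, 0)],
]
gray = rgb_to_gray(image)
binary = gray_to_binary(gray)
```

In practice a library such as OpenCV would handle these steps along with the Gaussian filtering and Canny edge detection the abstract lists; the pure-Python version only shows the arithmetic.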

The Voice Dialing System Using Dynamic Hidden Markov Models and Lexical Analysis

최성호;이강성;김순협
- 전자공학회논문지B / Vol. 28B, No. 7 / pp.548-556 / 1991
In this paper, continuously spoken Korean digits are recognized using DHMM (Dynamic Hidden Markov Model) and lexical analysis to provide a basis for developing a voice dialing system. Speech is first segmented into phoneme units and then recognized. The system is divided into a segmentation section, a standard speech design section, a recognition section, and a lexical analysis section. In the segmentation section, speech is segmented using the zero-crossing rate (ZCR), the 0th-order LPC cepstrum, and Ai, a voice speech detection parameter that changes over time. In the standard speech design section, 19 phonemes or syllables are trained by DHMM and designed as standard speech. In the recognition section, the phoneme stream is recognized by the Viterbi algorithm. In the lexical decoder section, the finally recognized continuous digits are output. The experiment showed a recognition rate of 85.1% on data consisting of 21 classes of 7 continuous digits, combined to cover all occurrences, each spoken 7 times by 10 men.
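
Of the segmentation features named above, the zero-crossing rate is the simplest to illustrate. The sketch below computes a per-frame ZCR in plain Python. The frame length, hop size, and function names are assumptions for illustration, and the abstract does not describe how the paper combines ZCR with the LPC cepstrum and the Ai parameter.

```python
from typing import List, Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def frame_zcr(signal: Sequence[float], frame_len: int, hop: int) -> List[float]:
    """ZCR of each full frame of length frame_len, advancing by hop samples."""
    return [
        zero_crossing_rate(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


# Toy signal: a rapidly alternating frame followed by a constant frame.
signal = [1.0, -1.0, 1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
rates = frame_zcr(signal, frame_len=4, hop=4)
```

High ZCR is typical of noise-like unvoiced sounds and low ZCR of voiced sounds, which is why it is a common cue for locating phoneme boundaries.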

Operational Hydrological Forecast for the Nakdong River Basin Using HSPF Watershed Model

신창민;나은혜;이은정;김덕길;민중혁
- 한국물환경학회지 / Vol. 29, No. 2 / pp.212-222 / 2013
A watershed model was constructed using Hydrological Simulation Program-Fortran (HSPF) to quantitatively predict the stream flows at major tributaries of the Nakdong River basin, Korea. The entire basin was divided into 32 segments to effectively account for spatial variations in meteorological data and land segment parameter values of each tributary. The model was calibrated at ten tributaries, including the main stream of the river, for a three-year period (2008 to 2010). The deviation values (Dv) of runoff volumes for operational stream flow forecasting over a six-month period (2012.1.2 to 2012.6.29) at the ten tributaries ranged from -38.1 to 23.6%, which is on average 7.8% higher than those of runoff volumes for model calibration (-12.5 to 8.2%). The increased prediction errors came mainly from the uncertainties of numerical weather prediction modeling; nevertheless, the stream flow forecasting results presented in this study were in good agreement with the measured data.
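
The deviation of runoff volumes (Dv) reported above is a percentage difference between total simulated and total observed runoff. The sketch below shows one common way to compute it. The sign convention (simulated minus observed, divided by observed) is an assumption, since the abstract does not give the formula, and the flow values are made up for illustration.

```python
from typing import Sequence


def runoff_volume_deviation(simulated: Sequence[float], observed: Sequence[float]) -> float:
    """Percent deviation of total simulated runoff from total observed runoff.

    Assumed convention: positive values mean the model over-predicts.
    """
    total_observed = sum(observed)
    total_simulated = sum(simulated)
    return (total_simulated - total_observed) / total_observed * 100.0


# Hypothetical flow series (same units for both), not data from the paper.
simulated_flow = [110.0, 90.0, 130.0]
observed_flow = [100.0, 100.0, 100.0]
dv_percent = runoff_volume_deviation(simulated_flow, observed_flow)
```

With the opposite convention (observed minus simulated), the sign of every reported value would flip, so the paper's own definition should be checked before comparing numbers.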

The Role of Post-lexical Intonational Patterns in Korean Word Segmentation

Kim, Sa-Hyang
- 음성과학 / Vol. 14, No. 1 / pp.37-62 / 2007
The current study examines the role of post-lexical tonal patterns of a prosodic phrase in word segmentation. In a word-spotting experiment, native Korean listeners were asked to spot a disyllabic or trisyllabic word in a twelve-syllable speech stream composed of three Accentual Phrases (AP). Words occurred with various post-lexical intonation patterns. The results showed that listeners spotted more words in phrase-initial than in phrase-medial position, suggesting that the AP-final H tone of the preceding AP helped listeners segment the phrase-initial word in the target AP. Results also showed that listeners' error rates were significantly lower when words occurred with an initial rising tonal pattern, the most frequent intonational pattern imposed on multisyllabic words in Korean, than with non-rising patterns. This result was observed in both AP-initial and AP-medial positions, regardless of the frequency and legality of the overall AP tonal patterns. Tonal cues other than the initial rising tone did not positively influence the error rate. These results not only indicate that a rising tone in AP-initial and AP-final position is a reliable cue for word boundary detection for Korean listeners, but further suggest that phrasal intonation contours serve as a possible word boundary cue in languages without lexical prominence.

Performance Analysis of Synchronization Communication Protocols for Real-Time Multimedia Services

김태규;조동호
- 전자공학회논문지A / Vol. 31A, No. 4 / pp.1-10 / 1994
In the real-time delivery of multimedia data streams over networks, the most serious synchronization problems are the interruption of continuity within a single media stream and the mismatch of data belonging to the same time interval when multimedia data streams are transferred in parallel on different channels. Several mechanisms have been proposed to handle these problems. In this paper, these mechanisms are analyzed and compared from various points of view by computer simulation. The simulation results show that the method using segmentation and the method using a separate synchronization channel are superior to the method using synchronization marks in terms of real-time transmission and quality of service. On the other hand, the method using segmentation is superior to the method using a separate synchronization channel in terms of channel utilization.

The Effect of Strong Syllables on Lexical Segmentation in English Continuous Speech by Korean Speakers

김선미;남기춘
- 말소리와 음성과학 / Vol. 5, No. 2 / pp.43-51 / 2013
Native English listeners tend to treat strong syllables in a speech stream as the potential initial syllables of new words, since the majority of lexical words in English have word-initial stress. The current study investigates whether Korean (L1) - English (L2) late bilinguals perceive strong syllables in English continuous speech as word onsets, as native English listeners do. In Experiment 1, word-spotting was slower when the word-initial syllable was strong, indicating that Korean listeners do not perceive strong syllables as word onsets. Experiment 2 was conducted to rule out the possibility that the results of Experiment 1 arose because the strong-initial targets used in Experiment 1 were themselves slower to recognize than the weak-initial targets. We employed the gating paradigm in Experiment 2 and measured the Isolation Point (IP, the point at which participants correctly identify a word without subsequently changing their minds) and the Recognition Point (RP, the point at which participants correctly identify the target with 85% or greater confidence) for the targets excised from the non-words in the two conditions of Experiment 1. Both the mean IPs and the mean RPs were significantly earlier for the strong-initial targets, which means that the results of Experiment 1 reflect the difficulty of segmentation when the initial syllable of a word is strong. These results are consistent with Kim & Nam (2011), indicating that strong syllables are not perceived as word onsets by Korean listeners and interfere with lexical segmentation in English running speech.
https://doi.org/10.13064/KSSS.2013.5.2.043

