• Title/Summary/Keyword: Research dataset

Search Result 1,186, Processing Time 0.037 seconds

Construction and Effectiveness Evaluation of Multi Camera Dataset Specialized for Autonomous Driving in Domestic Road Environment (국내 도로 환경에 특화된 자율주행을 위한 멀티카메라 데이터 셋 구축 및 유효성 검증)

  • Lee, Jin-Hee;Lee, Jae-Keun;Park, Jaehyeong;Kim, Je-Seok;Kwon, Soon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.17 no.5
    • /
    • pp.273-280
    • /
    • 2022
  • Along with the advancement of deep learning technology, securing high-quality dataset for verification of developed technology is emerging as an important issue, and developing robust deep learning models to the domestic road environment is focused by many research groups. Especially, unlike expressways and automobile-only roads, in the complex city driving environment, various dynamic objects such as motorbikes, electric kickboards, large buses/truck, freight cars, pedestrians, and traffic lights are mixed in city road. In this paper, we built our dataset through multi camera-based processing (collection, refinement, and annotation) including the various objects in the city road and estimated quality and validity of our dataset by using YOLO-based model in object detection. Then, quantitative evaluation of our dataset is performed by comparing with the public dataset and qualitative evaluation of it is performed by comparing with experiment results using open platform. We generated our 2D dataset based on annotation rules of KITTI/COCO dataset, and compared the performance with the public dataset using the evaluation rules of KITTI/COCO dataset. As a result of comparison with public dataset, our dataset shows about 3 to 53% higher performance and thus the effectiveness of our dataset was validated.

Towards Texture-Based Visualization of Multivariate Dataset

  • Mehmood, Raja Majid;Lee, Hyo Jong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2014.04a
    • /
    • pp.582-585
    • /
    • 2014
  • Visualization is a science which makes the invisible to visible through the techniques of experimental visualization and computer-aided visualization. This paper presents the practical aspects of visualization of multivariate dataset. In this paper, we will briefly discuss a previous research work and introduce a new visualization technique which will help us to design and develop a visualization tool for experimental visualization of multivariate dataset. Our newly developed visualization tool can be used in various domains. In this paper, we have chosen a software industry as an application domain and we used the multivariate dataset of software components computed by VizzMaintenance. VizzMaintenance is software analysis tool which give us multiple software metrics of open source Java based programs. Main objective of this research is to develop a new visualization tool for large multivariate dataset which will be more efficient and easy to perceive by viewer. Perception is very important for our research work and we have decided to test the perception level of our proposed visualization approach by researchers of our research lab.

Development of Korean Medicine Data Center(KDC) Teaching Dataset to Enhance Utilization of KDC (한의임상정보은행 활용도 제고를 위한 교육용 데이터 개발)

  • Baek, Younghwa;Lee, Siwoo
    • Journal of Sasang Constitutional Medicine
    • /
    • v.29 no.3
    • /
    • pp.242-247
    • /
    • 2017
  • Objective Korean medicine Data Center (KDC) has established large-scale biological and clinical data based on Korean medicine to demonstrate and validate its theory. The aim of this study was to develop KDC teaching dataset and user guideline to improve utilization of the KDC. Method KDC teaching dataset were selected using stratified random sampling according to the Sasang constitution (SC). This dataset included 72 variables of 500 sample subjects. The user guideline described how to conducted eight statistical analysis methods using the teaching dataset. Results The KDC teaching dataset was sampled from 200(40%) Taeeumin, 125(25%) Soeumin, and 175(35%) Soyanain. It was consisted of questionnaire (basic, habit, disease, symptom), physical exam (body measurement, blood pressure), blood exam, and expert' SC diagnosis. The usage guidelines provided instruction for users to perform several statistical analysis step by step with KDC teaching dataset. Conclusion We hope that our results will contribute to enhancing KDC utilization and understanding.

Real-world multimodal lifelog dataset for human behavior study

  • Chung, Seungeun;Jeong, Chi Yoon;Lim, Jeong Mook;Lim, Jiyoun;Noh, Kyoung Ju;Kim, Gague;Jeong, Hyuntae
    • ETRI Journal
    • /
    • v.44 no.3
    • /
    • pp.426-437
    • /
    • 2022
  • To understand the multilateral characteristics of human behavior and physiological markers related to physical, emotional, and environmental states, extensive lifelog data collection in a real-world environment is essential. Here, we propose a data collection method using multimodal mobile sensing and present a long-term dataset from 22 subjects and 616 days of experimental sessions. The dataset contains over 10 000 hours of data, including physiological, data such as photoplethysmography, electrodermal activity, and skin temperature in addition to the multivariate behavioral data. Furthermore, it consists of 10 372 user labels with emotional states and 590 days of sleep quality data. To demonstrate feasibility, human activity recognition was applied on the sensor data using a convolutional neural network-based deep learning model with 92.78% recognition accuracy. From the activity recognition result, we extracted the daily behavior pattern and discovered five representative models by applying spectral clustering. This demonstrates that the dataset contributed toward understanding human behavior using multimodal data accumulated throughout daily lives under natural conditions.

Generation of Super-Resolution Benchmark Dataset for Compact Advanced Satellite 500 Imagery and Proof of Concept Results

  • Yonghyun Kim;Jisang Park;Daesub Yoon
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.4
    • /
    • pp.459-466
    • /
    • 2023
  • In the last decade, artificial intelligence's dramatic advancement with the development of various deep learning techniques has significantly contributed to remote sensing fields and satellite image applications. Among many prominent areas, super-resolution research has seen substantial growth with the release of several benchmark datasets and the rise of generative adversarial network-based studies. However, most previously published remote sensing benchmark datasets represent spatial resolution within approximately 10 meters, imposing limitations when directly applying for super-resolution of small objects with cm unit spatial resolution. Furthermore, if the dataset lacks a global spatial distribution and is specialized in particular land covers, the consequent lack of feature diversity can directly impact the quantitative performance and prevent the formation of robust foundation models. To overcome these issues, this paper proposes a method to generate benchmark datasets by simulating the modulation transfer functions of the sensor. The proposed approach leverages the simulation method with a solid theoretical foundation, notably recognized in image fusion. Additionally, the generated benchmark dataset is applied to state-of-the-art super-resolution base models for quantitative and visual analysis and discusses the shortcomings of the existing datasets. Through these efforts, we anticipate that the proposed benchmark dataset will facilitate various super-resolution research shortly in Korea.

ETLi: Efficiently annotated traffic LiDAR dataset using incremental and suggestive annotation

  • Kang, Jungyu;Han, Seung-Jun;Kim, Nahyeon;Min, Kyoung-Wook
    • ETRI Journal
    • /
    • v.43 no.4
    • /
    • pp.630-639
    • /
    • 2021
  • Autonomous driving requires a computerized perception of the environment for safety and machine-learning evaluation. Recognizing semantic information is difficult, as the objective is to instantly recognize and distinguish items in the environment. Training a model with real-time semantic capability and high reliability requires extensive and specialized datasets. However, generalized datasets are unavailable and are typically difficult to construct for specific tasks. Hence, a light detection and ranging semantic dataset suitable for semantic simultaneous localization and mapping and specialized for autonomous driving is proposed. This dataset is provided in a form that can be easily used by users familiar with existing two-dimensional image datasets, and it contains various weather and light conditions collected from a complex and diverse practical setting. An incremental and suggestive annotation routine is proposed to improve annotation efficiency. A model is trained to simultaneously predict segmentation labels and suggest class-representative frames. Experimental results demonstrate that the proposed algorithm yields a more efficient dataset than uniformly sampled datasets.

Knowledge Model for Disaster Dataset Navigation

  • Hwang, Yun-Young;Yuk, Jin-Hee;Shin, Sumi
    • Journal of Information Science Theory and Practice
    • /
    • v.9 no.4
    • /
    • pp.35-49
    • /
    • 2021
  • In a situation where there are multiple diverse datasets, it is essential to have an efficient method to provide users with the datasets they require. To address this suggestion, necessary datasets should be selected on the basis of the relationships between the datasets. In particular, in order to discover the necessary datasets for disaster resolution, we need to consider the disaster resolution stage. In this paper, in order to provide the necessary datasets for each stage of disaster resolution, we constructed a disaster type and disaster management process ontology and designed a method to determine the necessary datasets for each disaster type and disaster management process step. In addition, we introduce a method to determine relationships between datasets necessary for disaster response. We propose a method for discovering datasets based on minimal relationships such as "isA," "sameAs," and "subclassOf." To discover suitable datasets, we designed a knowledge exploration model and collected 651 disaster-related datasets for improving our method. These datasets were categorized by disaster type from the perspective of disaster management. Categorizing actual datasets into disaster types and disaster management types allows a single dataset to be classified as multiple types in both categories. We built a knowledge exploration model on the basis of disaster examples to ensure the configuration of our model.

Derivation of Typical Meteorological Year of Daejeon from Satellite-Based Solar Irradiance (위성영상 기반 일사량을 활용한 대전지역 표준기상년 데이터 생산)

  • Kim, Chang Ki;Kim, Shin-Young;Kim, Hyun-Goo;Kang, Yong-Heack;Yun, Chang-Yeol
    • Journal of the Korean Solar Energy Society
    • /
    • v.38 no.6
    • /
    • pp.27-36
    • /
    • 2018
  • Typical Meteorological Year Dataset is necessary for the renewable energy feasibility study. Since National Renewable Energy Laboratory has been built Typical Meteorological Year Dataset in 1978, gridded datasets taken from numerical weather prediction or satellite imagery are employed to produce Typical Meteorological Year Dataset. In general, Typical Meteorological Year Dataset is generated by using long-term in-situ observations. However, solar insolation is not usually measured at synoptic observing stations and therefore it is limited to build the Typical Meteorological Year Dataset with only in-situ observation. This study attempts to build the Typical Meteorological Year Dataset with satellite derived solar insolation as an alternative and then we evaluate the Typical Meteorological Year Dataset made by using satellite derived solar irradiance at Daejeon ground station. The solar irradiance is underestimated when satellite imagery is employed.

A Manually Captured and Modified Phone Screen Image Dataset for Widget Classification on CNNs

  • Byun, SungChul;Han, Seong-Soo;Jeong, Chang-Sung
    • Journal of Information Processing Systems
    • /
    • v.18 no.2
    • /
    • pp.197-207
    • /
    • 2022
  • The applications and user interfaces (UIs) of smart mobile devices are constantly diversifying. For example, deep learning can be an innovative solution to classify widgets in screen images for increasing convenience. To this end, the present research leverages captured images and the ReDraw dataset to write deep learning datasets for image classification purposes. First, as the validation for datasets using ResNet50 and EfficientNet, the experiments show that the dataset composed in this study is helpful for classification according to a widget's functionality. An implementation for widget detection and classification on RetinaNet and EfficientNet is then executed. Finally, the research suggests the Widg-C and Widg-D datasets-a deep learning dataset for identifying the widgets of smart devices-and implementing them for use with representative convolutional neural network models.

Sign Language Dataset Built from S. Korean Government Briefing on COVID-19 (대한민국 정부의 코로나 19 브리핑을 기반으로 구축된 수어 데이터셋 연구)

  • Sim, Hohyun;Sung, Horyeol;Lee, Seungjae;Cho, Hyeonjoong
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.8
    • /
    • pp.325-330
    • /
    • 2022
  • This paper conducts the collection and experiment of datasets for deep learning research on sign language such as sign language recognition, sign language translation, and sign language segmentation for Korean sign language. There exist difficulties for deep learning research of sign language. First, it is difficult to recognize sign languages since they contain multiple modalities including hand movements, hand directions, and facial expressions. Second, it is the absence of training data to conduct deep learning research. Currently, KETI dataset is the only known dataset for Korean sign language for deep learning. Sign language datasets for deep learning research are classified into two categories: Isolated sign language and Continuous sign language. Although several foreign sign language datasets have been collected over time. they are also insufficient for deep learning research of sign language. Therefore, we attempted to collect a large-scale Korean sign language dataset and evaluate it using a baseline model named TSPNet which has the performance of SOTA in the field of sign language translation. The collected dataset consists of a total of 11,402 image and text. Our experimental result with the baseline model using the dataset shows BLEU-4 score 3.63, which would be used as a basic performance of a baseline model for Korean sign language dataset. We hope that our experience of collecting Korean sign language dataset helps facilitate further research directions on Korean sign language.