REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
Journal of the Korean Data and Information Science Society
Journal Basic Information
Korean Data and Information Science Society
Enhancing the performance of taxi application based on in-memory data grid technology
Choi, Chi-Hwan ; Kim, Jin-Hyuk ; Park, Min-Kyu ; Kwon, Kaaen ; Jung, Seung-Hyun ; Nazareno, Franco ; Cho, Wan-Sup ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1035~1045
DOI : 10.7465/jkdi.2015.26.5.1035
Recent studies in big data analysis show promising results from using main memory for rapid data processing. In-memory computing technology can be highly advantageous on high-performance servers with multi-core processors and tens of gigabytes of RAM. The network constraints of such infrastructure can be lessened by combining in-memory technology with distributed parallel processing. This paper applies this concept to a test taxi-hailing application while keeping its underlying RDBMS structure. Applying IMDG (in-memory data grid) technology in the application's backend API without restructuring the database schema yields a 6- to 9-fold increase in data processing performance and throughput. In particular, throughput changes very little even as the data processing load increases.
Predicting tobacco risk factors by using social big data
Song, Tae Min ; Song, Juyoung ; Cheon, Mi Kyung ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1047~1059
DOI : 10.7465/jkdi.2015.26.5.1047
This study predicts risk factors associated with cigarettes in Korea by applying data mining techniques to social big data collected from internet sources in Korea such as blogs, cafes, and SNSs. The key analysis results are as follows. First, when "raising cigarette price" is mentioned online, the negative group (i.e., the proportion of people holding negative views about smoking) increased from 58.6% to 74.8%, and when "lung cancer" is mentioned, it increased to 73.1%. Second, with regard to cigarettes in general, the positive group (i.e., the proportion of people holding positive views about smoking) decreased by 5.6% after cigarette prices were raised, while the negative group increased by 6.1%. Third, when policies related to "FCTC, raising cigarette price, non-smoking laws, smoking regulations, non-smoking ads, and non-smoking business" are mentioned online more frequently, the positive group tended to decrease. Finally, when "non-smoking drugs, non-smoking patches, and non-smoking gums" are mentioned online more frequently, the positive group tended to decrease. However, when "electronic cigarettes and supplements" are mentioned online more frequently, the positive group increased.
Latent mobility pattern analysis of bus passengers with LDA
Cho, Ah ; Lee, Kyung Hee ; Cho, Wan Sup ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1061~1069
DOI : 10.7465/jkdi.2015.26.5.1061
Recently, big data generated in the transportation sector has been widely used in transportation policy making and efficient system management. Bus passengers' mobility patterns give transportation policy makers useful insight for optimizing bus lines and time intervals in a city. We propose a new methodology to discover mobility patterns from transportation card data. We first estimate the bus stations where passengers get off, because transportation card data do not include get-off information in most cities. We then apply LDA (Latent Dirichlet Allocation), the most representative topic modeling technique, to discover the mobility patterns of bus passengers in Cheong-Ju city. To understand the discovered patterns, we construct a data warehouse and perform multi-dimensional analysis by bus route, region, time period, and mobility pattern (get-on/get-off station). In the case of Cheong-Ju, we discovered mobility pattern 1 from suburban areas to Cheong-Ju terminal, mobility pattern 2 from residential areas to commercial areas, and mobility pattern 3 from school areas to commercial areas.
A study on the nation images of the big three exporting countries in East Asia shown in Wikipedia English-Edition
Lee, Youngwhan ; Chun, Heuiju ; Sawng, Youngwha ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1071~1085
DOI : 10.7465/jkdi.2015.26.5.1071
The researchers attempted to develop a way to extract a near real-time online nation image using social media. Drawing on previous studies of nation images and the categories defined in Wikipedia, they constructed an ontology that reflects the characteristics of nation image. Separately, data sets from various social media were compared, and the click views of the Wikipedia English edition were selected. The ontology was applied to the most recent six years of data for the three big exporting countries of East Asia: China, Japan, and Korea. To compare the nation images, correspondence analysis was employed to show images in the areas of politics, society, culture, and economy. The extracted nation images are reasonable representations of the countries. The researchers checked them against a few known government policies and confirmed that they could help government officers make foreign policies to boost the nation's exports and could serve as a key performance index for those policies.
Influenza prediction models by using meteorological and social media informations
Hwang, Eun-Ji ; Na, Jong-Hwa ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1087~1095
DOI : 10.7465/jkdi.2015.26.5.1087
Influenza, commonly known as "the flu", is an infectious disease caused by the influenza virus. In this paper, we consider regression models for predicting influenza. While most previous research mainly uses meteorological variables as predictors, we also include social media information in the models. We found that the contributions of the two types of information are comparable. We used influenza medical treatment data provided by the National Health Insurance Service (NHIS) and meteorological data provided by the Korea Meteorological Administration (KMA). We collected social media information (the amount of Twitter buzz) from Twitter. A time series model is also considered for comparison.
Crime risk implementation for safe return service
Park, Mi Ri ; Kim, Yu Sin ; Choi, Sang Hyun ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1097~1104
DOI : 10.7465/jkdi.2015.26.5.1097
Rapid social and economic growth has brought positive results. At the same time, crime has increased, which makes crime prevention important. Many papers analyze crime trends and crime types, and, building on these, some studies aim to ensure people's safety. Calculating crime risk is necessary for crime prevention measures to have a strong effect. This paper uses crime data provided by San Francisco and victim data provided by the FBI, and proposes a crime risk calculation. By analyzing the type of user, it gives the risk degree different weights according to the user and assesses the risk of crime.
Big data mining for natural disaster analysis
Kim, Young-Min ; Hwang, Mi-Nyeong ; Kim, Taehong ; Jeong, Chang-Hoo ; Jeong, Do-Heon ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1105~1115
DOI : 10.7465/jkdi.2015.26.5.1105
Big data analysis for disasters has recently begun, especially for text data such as social media. Social data usually supports the final two stages of disaster management, which consists of four stages: prevention, preparation, response, and recovery. In contrast, big data analysis of meteorological data can contribute to prevention and preparation. This motivated us to review big data technologies that deal with non-text data rather than text in the natural disaster area. To this end, we first explain the main keywords, big data, data mining, and machine learning (sec. 2). Then we introduce state-of-the-art machine learning techniques in meteorology-related fields (sec. 3). We show how traditional machine learning techniques have been adapted for climatic data by taking domain specificity into account. Applications of these techniques in natural disaster response are then introduced (sec. 4), and we finally conclude with several future research directions.
Data analysis of 4M data in small and medium enterprises
Kim, Jae Sung ; Cho, Wan Sup ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1117~1128
DOI : 10.7465/jkdi.2015.26.5.1117
To secure an important competitive advantage in manufacturing, automation and information systems for the manufacturing process have been introduced; however, small and medium enterprises have not yet benefited from information in the manufacturing field. They manage manufacturing processes that depend on operators' experience and handwritten data, which makes it hard to identify clearly the cause of defective goods when low-grade goods occur. In this study, we analyze critical factors that affect the quality of some manufacturing processes in terms of 4M. We also studied automobile parts processing at small and medium manufacturing enterprises managed with handwritten data, in order to collect the handwritten data and to utilize sensor data in the future. The analysis results show no difference in defective quantity across machines, while raw materials, production quality, and task tracking show significant differences.
An elastic distributed parallel Hadoop system for bigdata platform and distributed inference engines
Song, Dong Ho ; Shin, Ji Ae ; In, Yean Jin ; Lee, Wan Gon ; Lee, Kang Se ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1129~1139
DOI : 10.7465/jkdi.2015.26.5.1129
The inference process generates additional triples from knowledge represented as RDF triples in semantic web technology. Tens of millions of triples of initial big data, together with the additionally inferred triples, become a knowledge base for applications such as a QA (question and answer) system. The inference engine requires more computing resources to process the triples generated during inference. Additional computing resources supplied by the underlying resource pool in cloud computing can shorten the execution time. This paper presents an algorithm that allocates the number of computing nodes "elastically" at runtime on Hadoop, depending on the size of the knowledge data fed in. The proposed model has a layered architecture: the top layer for applications, the middle layer for the distributed parallel inference engine that processes the triples, and the lower layer for elastic Hadoop and server virtualization. System algorithms and test data are analyzed and discussed in this paper. The model has the benefit that the rich set of legacy Hadoop applications can run faster on this system without any modification.
Energy ICT convergence with big data services
Choi, Jongwoo ; Lee, Il Woo ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1141~1154
DOI : 10.7465/jkdi.2015.26.5.1141
This paper describes the convergence of energy technology and information and communication technology (ICT), which helps to reduce energy consumption effectively. While much research has addressed the growth in world energy usage, most of it focuses on the efficiency of energy supply, transfer, and consumption equipment. Applying ICT to decrease energy usage could help to find energy-saving factors in new fields that have not previously been considered valuable. A big data service built on the convergence of energy technology and ICT enables correlation analyses of large sets of energy and environmental data. Finding trends in the data with a big data service helps to develop energy-saving policies. Furthermore, it could be a step toward developing new business models. This paper introduces real cases of a company and a project that provide a big data service with ICT convergence.
Big data distributed processing system using RHadoop
Shin, Ji Eun ; Jung, Byung Ho ; Lim, Dong Hoon ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1155~1166
DOI : 10.7465/jkdi.2015.26.5.1155
Storing or analyzing exponentially growing big data is almost impossible with traditional technologies; Hadoop is a new technology that makes it possible. Recently, R has been used as an engine for big data analysis based on distributed processing with Hadoop. With RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis on actual and simulated data of various sizes. Experimental results showed that our RHadoop system became faster as the number of data nodes increased. We also compared the performance of our RHadoop system with the lm function and the biglm package available on bigmemory. The results showed that our RHadoop system was faster than the other packages, owing to parallel processing in which the number of map tasks increases as the size of the data increases.
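One common way to parallelize multiple regression in a map/reduce setting is to have each map task compute the sufficient statistics X^T X and X^T y for its chunk, sum them in the reduce step, and solve the normal equations once. The abstract does not say that the authors used this scheme, so the following is only a minimal Python sketch of the general idea, not their RHadoop code; the function names (`map_chunk`, `reduce_stats`, `solve`, `fit`) and the toy data are hypothetical.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: sufficient statistics (X^T X, X^T y) for one chunk.

    Each row is (x_1, ..., x_k, y); an intercept column of 1.0 is prepended.
    """
    p = len(rows[0])  # k predictors + intercept = k + 1 coefficients
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for *xs, y in rows:
        x = [1.0, *xs]
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_stats(a, b):
    """Reduce step: element-wise sum of two chunks' statistics."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - s) / m[i][i]
    return beta


def fit(chunks):
    """Combine per-chunk statistics, then solve the normal equations once."""
    xtx, xty = reduce(reduce_stats, map(map_chunk, chunks))
    return solve(xtx, xty)


# Toy data generated exactly from y = 1 + 2*x1 + 3*x2 (illustrative only).
data = [(x1, x2, 1 + 2 * x1 + 3 * x2) for x1 in range(4) for x2 in range(3)]
chunks = [data[:5], data[5:]]  # stand-ins for the inputs of two map tasks
beta = fit(chunks)  # [intercept, coefficient of x1, coefficient of x2]
```

Because the statistics are plain sums, the chunks can be processed in any order and on any number of nodes, which is why adding map tasks as the data grows can reduce wall-clock time.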
Study on the social issue sentiment classification using text mining
Kang, Sun-A ; Kim, Yoo Sin ; Choi, Sang Hyun ;
Journal of the Korean Data and Information Science Society, volume 26, issue 5, 2015, Pages 1167~1173
DOI : 10.7465/jkdi.2015.26.5.1167
The development of information and communication technology such as SNS, blogs, and bulletin boards has provided a variety of places where people can express their thoughts and comments, allowing big data to grow, and many people share their opinions on social issues on SNS such as Twitter. In this study, we build a sentiment dictionary about social issues in advance and conduct sentiment analysis with the structured dictionary, in order to gather opinions on social issues posted on Twitter. The data used are tweets containing "bikini" and "nakkomsu". In the analysis results, precision is 61% and the F1-score is 74%. This study is expected to suggest a standard for dictionary construction that allows positive and negative opinions on specific social issues to be classified.
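The abstract reports precision and F1-score but not recall. Assuming the standard definition F1 = 2PR / (P + R), the implied recall can be recovered by rearranging to R = F1 * P / (2P - F1). The short Python sketch below does this arithmetic; `implied_recall` is a hypothetical helper name, and because the reported figures are rounded, the result is only approximate.

```python
def implied_recall(precision, f1):
    """Recall implied by precision and F1, assuming F1 = 2PR / (P + R)."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall: F1 must be less than 2 * precision")
    return f1 * precision / denom


# Figures reported in the abstract: precision 61%, F1-score 74%.
recall = implied_recall(0.61, 0.74)
print(round(recall, 2))  # 0.94
```

Under that assumption, the reported figures correspond to a recall of roughly 94%, meaning the classifier finds most of the target class at the cost of a fairly high share of false positives.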