REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
Journal of the Korean Data and Information Science Society
Journal Basic Information
Korean Data and Information Science Society
Volume & Issues
Volume 26, Issue 6 - Nov 2015
Volume 26, Issue 5 - Sep 2015
Volume 26, Issue 4 - Jul 2015
Volume 26, Issue 3 - May 2015
Volume 26, Issue 2 - Mar 2015
Volume 26, Issue 1 - Jan 2015
Development of statistical forecast model for PM10 concentration over Seoul
Sohn, Keon Tae ; Kim, Dahong ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 289~299
DOI : 10.7465/jkdi.2015.26.2.289
The objective of the present study is to develop a statistical quantitative forecast model for PM10 concentration over Seoul. We used three types of data (weather observation data from Korea, weather observation data from China collected via GTS, and air quality numerical model forecasts). To apply the daily forecast system, hourly data were converted to daily data and then lagged. The potential predictors were selected based on correlation analysis and a multicollinearity check. Model validation was performed to check model stability. We applied two models (a multiple regression model and a threshold regression model) separately. The two models were compared based on scatter plots of forecasts and observations, time series plots, RMSE, and skill scores. As a result, the threshold regression model performs better than the multiple regression model in high PM10 concentration cases.
Development of epidemic model using the stochastic method
Ryu, Soorack ; Choi, Boseung ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 301~312
DOI : 10.7465/jkdi.2015.26.2.301
The purpose of this paper is to establish an epidemic model that explains the process of disease spread. The process of disease spread can be classified into two types: deterministic and stochastic. Most studies have assumed that the process is deterministic and have established the model using ordinary differential equations. In this article, we build a disease spread prediction model based on the SIR (Susceptible - Infectious - Recovered) model. We first estimated the model parameters using the least squares method and applied them to a deterministic model using ordinary differential equations. We also applied them to a stochastic model based on the Gillespie algorithm. The methods introduced in this paper are applied to weekly malaria case counts from January 2001 to March 2003, released by the Korea Centers for Disease Control and Prevention. As a result, we conclude that our model explains the process of disease spread well.
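As a rough illustration of the stochastic side of the abstract above, the following is a minimal sketch of a Gillespie simulation for an SIR model. The function name, the frequency-dependent infection rate beta * S * I / N, and the parameter values are assumptions made for the example. They are not taken from the paper, and no malaria data are used.

```python
import random


def gillespie_sir(s0, i0, r0, beta, gamma, t_max, seed=0):
    """Simulate one stochastic SIR trajectory with the Gillespie algorithm.

    Hypothetical helper: beta is the infection rate, gamma the recovery rate.
    Returns a list of (time, S, I, R) tuples, one per event.
    """
    rng = random.Random(seed)
    n = s0 + i0 + r0
    t, s, i, r = 0.0, s0, i0, r0
    path = [(t, s, i, r)]
    while i > 0:
        rate_infect = beta * s * i / n   # event S + I -> 2I
        rate_recover = gamma * i         # event I -> R
        total = rate_infect + rate_recover
        t += rng.expovariate(total)      # exponential waiting time to next event
        if t > t_max:
            break
        # Choose which event fires, with probability proportional to its rate.
        if rng.random() * total < rate_infect:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        path.append((t, s, i, r))
    return path


path = gillespie_sir(s0=99, i0=1, r0=0, beta=0.5, gamma=0.1, t_max=100.0, seed=1)
```

Each run with a different seed gives a different trajectory. Averaging many runs is one way to compare the stochastic model with the deterministic ODE solution.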
Maximum likelihood estimation for a mixture distribution
Hwang, Seonyeong ; Sohn, Seung Hye ; Oh, Changhyuck ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 313~322
DOI : 10.7465/jkdi.2015.26.2.313
A mixture distribution of a discrete uniform or degenerate distribution and two binomial distributions is proposed, and a method of obtaining the maximum likelihood estimates of the parameters is provided. For the proposed model, simulation studies were conducted to assess the performance of the maximum likelihood estimates, and a mixture of a degenerate distribution and two binomial distributions was fitted to lecture evaluation data with good results.
Optimal portfolio and VaR of KOSPI200 using One-factor model
Ko, Kwang Yee ; Son, Young Sook ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 323~334
DOI : 10.7465/jkdi.2015.26.2.323
The current VaR model based on J.P. Morgan's RiskMetrics structurally cannot reflect the future economic situation. In this study, we propose a One-factor model resulting from the Wiener stochastic process decomposed into a systematic risk factor and an idiosyncratic risk factor. We are therefore able to perform preemptive risk management by reflecting the predicted common risk factors in the model. Stocks in the portfolio are independent of each other because the common factors are fixed at their predicted values. Therefore, we can easily determine the investment in each stock that minimizes the variance of the portfolio. In addition, the portfolio VaR is decomposed into the sum of the individual VaRs, so we can effectively construct the portfolio to meet the target maximum loss.
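The remark that independence makes the minimum-variance allocation easy can be illustrated with a small sketch. It uses only the textbook result that, for mutually independent assets with weights summing to one, the variance-minimizing weights are proportional to the inverse variances. The function names and the numbers are assumptions for the example and are not the paper's One-factor model.

```python
def min_variance_weights(variances):
    """Weights minimizing portfolio variance for mutually independent assets.

    With independent returns the portfolio variance is sum(w_i**2 * var_i).
    Minimizing it subject to sum(w_i) == 1 gives w_i proportional to 1 / var_i.
    """
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [x / total for x in inverse]


def portfolio_variance(weights, variances):
    """Variance of a weighted sum of independent returns."""
    return sum(w * w * v for w, v in zip(weights, variances))


variances = [1.0, 4.0]  # illustrative idiosyncratic variances
weights = min_variance_weights(variances)
optimal = portfolio_variance(weights, variances)
equal = portfolio_variance([0.5, 0.5], variances)
```

In this toy case the less volatile asset receives the larger weight, and the resulting variance is below that of the equally weighted portfolio.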
The comparison of coauthor networks of two statistical journals of the Korean Statistical Society using social network analysis
Chun, Heuiju ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 335~346
DOI : 10.7465/jkdi.2015.26.2.335
The purpose of this study is to use social network analysis to compare not only the network influence of individual coauthors but also the types and properties of the coauthor networks of two journals published by the Korean Statistical Society: Communications for Statistical Applications and Methods and the Korean Journal of Applied Statistics. In the comparison of network structure, density, inclusiveness, reciprocity and clustering coefficient, which represent the type of coauthor network, show almost the same values, and the Korean Journal of Applied Statistics has larger values of average degree, average distance and diameter because it has more nodes than Communications for Statistical Applications and Methods. Overall, the two journals have very similar types of coauthor network. In the comparison of network centrality, closeness centrality and betweenness centrality of the Korean Journal of Applied Statistics are larger than those of Communications for Statistical Applications and Methods at the 0.05 significance level. The coauthor network of the Korean Journal of Applied Statistics has faster information delivery and stronger betweenness than that of Communications for Statistical Applications and Methods.
Projection analysis for balanced incomplete block designs
Choi, Jaesung ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 347~354
DOI : 10.7465/jkdi.2015.26.2.347
This paper deals with a method for intrablock analysis of balanced incomplete block designs on the basis of projections under the assumption of a mixed effects model. It shows how to construct a model at each step of the stepwise procedure and discusses how to use projections for intrablock analysis. Projections are obtained in vector subspaces orthogonal to each other, so the estimates of the treatment effects are not affected by the block effects. The estimability of a parameter or a function of parameters is discussed, and eigenvectors are used for the construction of estimable functions.
A meta analysis of the climate change impact on rice yield in South Korea
Shin, Deok Ha ; Lee, Mun Su ; Park, Ju-Hyun ; Lee, Yung-Seop ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 355~365
DOI : 10.7465/jkdi.2015.26.2.355
As the global climate has dramatically changed over the past decades, there has been active research on evaluating its effects on food security, which has been recognized as one of the most important issues in the field. In this study, we analyzed the impact of climate change on Korean agriculture using meta-analysis methods. In particular, our research focuses on estimating the effect of CO2 concentration and two adaptations (planting-date and cultivar adjustments) on rice, which accounts for a large proportion of Korean domestic agriculture. Unlike traditional general meta-analysis methods that use summary statistics of effects of interest, a meta-analysis specific to the agriculture literature was conducted by integrating the data on rice yield that were generated under various CO2 emission scenarios and general circulation models in the 6 collected individual studies. As a modeling approach, the rice yield change ratio was set as the dependent variable, and the main and interaction effects of CO2 concentration and adaptation were considered as independent variables in a regression model. As a result, CO2 is estimated to have opposite effects on rice yield depending on whether either of the two adaptations is applied: a decreasing effect without adaptation and an increasing effect with adaptation. In addition, the cultivar adjustment turns out to have a larger increasing effect on rice yield than the planting-date adjustment. The results of the study are expected to be used as basic quantitative data for establishing policies that respond to future climate change.
A study on the ordering of PIM family similarity measures without marginal probability
Park, Hee Chang ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 367~376
DOI : 10.7465/jkdi.2015.26.2.367
Today, big data has become a hot keyword; big data may be defined as a collection of data sets so huge and complex that it becomes difficult to process by traditional methods. Clustering identifies the information in a big database by assigning a set of objects to clusters so that the objects in the same cluster are more similar to each other than to those in other clusters. The similarity measures used in cluster analysis may be classified into various types depending on the nature of the data. In this paper, we computed upper and lower limits for probability interestingness measure based similarity measures without marginal probability, such as the Yule I and II, Michael, Digby, Baulieu, and Dispersion measures, and we compared these measures using real data and a simulated experiment. According to Warrens (2008), coefficients with the same quantities in the numerator and denominator that are bounded and are close to each other in the ordering are likely to be more similar. Thus, results on bounds provide a means of classifying various measures. Also, knowing which coefficients are similar provides insight into the stability of a given algorithm.
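To make the idea of bounded similarity measures on a 2x2 table concrete, here is a small sketch of the classical Yule Q and Yule Y coefficients, which use only the cell counts a, b, c, d and no marginal probabilities. Treating these as the "Yule I and II" measures named in the abstract is an assumption, and the function names and counts are illustrative.

```python
import math


def yule_q(a, b, c, d):
    """Yule's Q: (ad - bc) / (ad + bc), bounded between -1 and 1.

    Undefined (division by zero) when both ad and bc are zero.
    """
    ad, bc = a * d, b * c
    return (ad - bc) / (ad + bc)


def yule_y(a, b, c, d):
    """Yule's Y: (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc)), also in [-1, 1]."""
    root_ad, root_bc = math.sqrt(a * d), math.sqrt(b * c)
    return (root_ad - root_bc) / (root_ad + root_bc)


# Illustrative 2x2 table: a and d are the concordant cells, b and c the discordant ones.
q = yule_q(4, 1, 1, 4)
y = yule_y(4, 1, 1, 4)
```

Both coefficients share the same sign and the same bounds, and for this table Y is smaller in absolute value than Q, which is the kind of ordering relation the abstract refers to.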
Estimation to improve survey efficiency in callback
Park, Hyeonah ; Na, Seongryong ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 377~385
DOI : 10.7465/jkdi.2015.26.2.377
After performing callbacks for nonresponses in a sample survey, we present a regression-form estimator using an auxiliary variable and a variance estimator using a replication method. A parametric inference method for the response probability is also presented. We study a highly efficient unbiased estimator of the population mean and a consistent variance estimator under callback. We also verify the theory through simulation.
The study on the determinants of the number of job changes
Park, Sungik ; Ryu, Jangsoo ; Kim, Jonghan ; Cho, Jangsik ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 387~397
DOI : 10.7465/jkdi.2015.26.2.387
In this paper, the determinants of the number of job changes in the SMEs (small and medium enterprises) youth-intern project are analyzed, utilizing the SMEs youth-intern DB and the employment insurance DB. Since the number of job changes is count data taking nonnegative integer values, general linear regression analysis is inappropriate. Therefore, four models (the Poisson regression model, zero-inflated Poisson regression model, negative binomial regression model and zero-inflated negative binomial regression model) were fitted to the count data. The zero-inflated negative binomial regression model was selected as the best model. The major results are as follows. First, the number of job changes is significantly smaller in the treatment group than in the control group. Second, the number of job changes is significantly smaller in the young-age group than in the old-age group. Third, the number of job changes of men is significantly greater than that of women. Lastly, the number of job changes in bigger firms is significantly smaller than in smaller firms.
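The difference between an ordinary and a zero-inflated count model can be seen directly in the probability mass functions. The sketch below contrasts the Poisson pmf with the standard zero-inflated Poisson pmf. The zero-inflated negative binomial model selected in the paper is not implemented here, and the function names and parameter values are assumptions for the example.

```python
import math


def poisson_pmf(k, lam):
    """P(K = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf.

    With probability pi the count is a structural zero; otherwise it is Poisson(lam).
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


# Illustrative values: mean 2 job changes, 30% structural zeros.
p_zero_poisson = poisson_pmf(0, 2.0)
p_zero_zip = zip_pmf(0, 2.0, 0.3)
```

With these values the zero-inflated model assigns a much larger probability to zero than the plain Poisson model does. That is the feature that matters when the data contain more zeros than a Poisson fit allows.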
Data visualization of airquality data using R software
Oh, Youngchang ; Park, Eunsik ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 399~408
DOI : 10.7465/jkdi.2015.26.2.399
This paper presents airquality data through data visualization in several ways and describes its characteristics in relation to statistical methods of analysis. R was used as the visualization tool. The airquality data were measured in New York City from May to September 1973. First, simple exploratory data analysis was carried out, in terms of both data visualization and analysis, to find univariate characteristics. Then, through data transformation and multiple regression analysis, a model describing the air quality level was found. Also, after some data categorization, overall features of the data were explored using box plots, three-dimensional perspective drawings and scatter plots.
Implementation of smart chungbuk tourism based on SNS data analysis
Cho, Wan-Sup ; Cho, Ah ; Kwon, Kaaen ; Yoo, Kwan-Hee ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 409~418
DOI : 10.7465/jkdi.2015.26.2.409
With the development of mobile devices and the Internet, information is actively exchanged through SNS and blogs. Blogs are widely used as a space where people share their experiences after visiting tourist attractions. We propose a method of recommending associated tourist attractions based on tourists' opinions, using issue analysis, association analysis, and sentiment analysis of various online reviews, including news, in order to help develop tour products and policies. The results show that the northern area of Chungbuk province was selected as issue attractions, and associated attractions/keywords were identified for a given well-known attraction. Positive/negative opinion in review texts was analyzed so that users can grasp the reasons for the sentiments. A multidimensional analysis technique was integrated to derive additional sophisticated insights and various policy proposals for smart tourism.
Text mining on internet-news regarding climate change and food
Hyun, Yoonjin ; Kim, Jeong Seon ; Jeong, Jin-Wook ; Yun, Simon ; Lee, Moon-Soo ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 419~427
DOI : 10.7465/jkdi.2015.26.2.419
Despite the correlation between climate change and food-related information, it is still not easy for many users to get access to the information of interest. This study investigated how much climate change and food-related information are correlated with each other and how often they are exposed, through frequency and correlation analysis of news articles on internet portals. Through analysis of the frequency of climate change and food-related news articles, this study was able to determine how often they are exposed at the same time by internet news portals. In addition, a total of 59 correlation rules regarding climate change and food-related vocabularies were derived from these news articles using climate change and food-related glossaries. Then, the correlation between certain climate change-related and food-related words was analyzed in order to package the related words.
Gender differences in factors influencing the school adjustment by BMI
Seo, Ji Yeong ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 429~440
DOI : 10.7465/jkdi.2015.26.2.429
This study investigated factors influencing school adjustment according to gender and body mass index (BMI) of middle school students who participated in the 2nd-wave Korea Children and Youth Panel Study (KCYPS). This study used a cross-sectional design with secondary analysis of the KCYPS. The variables were parental interest, behavioral problem, aggression, attention problem, somatic symptom, social withdrawal, depression, and academic achievement. The data were analyzed with descriptive statistics, Pearson's correlation coefficients, and multiple regressions. School adjustment was significantly associated with high academic achievement, explaining 11.3~19.1% of the variance in boys. School adjustment was significantly associated with attention problem, explaining 14.9~42.4% of the variance in girls. Factors influencing school adjustment were significantly different according to gender and BMI. To improve school adjustment, it is necessary to develop gender-specific school adjustment promotion programs according to BMI.
Integrative literature review of nursing performance in Korea
Cho, Sumi ; Lee, Eunjoo ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 441~453
DOI : 10.7465/jkdi.2015.26.2.441
This paper reviews nursing performance in Korea to suggest the direction of future research. A literature review of research journal articles on nursing performance was performed. Thirty-three articles were analyzed for research design, measurement tools, definition, and variables associated with nursing performance. Most of the research on nursing performance in Korea was descriptive correlation study. There was considerable confusion about the Korean term and definitions of nursing performance. The instrument developed by Park (1989) was used to measure nursing performance. The variables associated with nursing performance included leadership and empowerment. No agreement on definitions or concepts of nursing performance was evident in Korea. Research on nursing performance that addresses all domains of nursing is needed to analyze the impact of nursing performance on patient and healthcare outcomes.
Influencing factors on health education performance of nurse in health promoting hospitals
Lee, Jinsook ; Kwon, Sohi ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 455~464
DOI : 10.7465/jkdi.2015.26.2.455
This study aimed to identify the factors influencing health education performance of health promoting hospital nurses. The study was conducted with 231 nurses from four health promoting hospitals. Data were collected from May to June, 2013. Health education performance was positively correlated with education level, years of clinical experience, health promotion role recognition, and self-efficacy for health education. Health promotion role recognition (p=.001), self-efficacy for health education (p<.001), and clinical experiences (p=.007) were significant predictors of health promoting hospital nurses' health education performance and explained 27.8% of the variance. The strategies to improve health promotion role recognition and self-efficacy for health education should be developed to improve health education performance of health promoting hospital nurses.
The local influence of LIU type estimator in linear mixed model
Zhang, Lili ; Baek, Jangsun ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 465~474
DOI : 10.7465/jkdi.2015.26.2.465
In this paper, we study the local influence analysis of the LIU type estimator in linear mixed models. Using the method proposed by Shi (1997), the local influence of the LIU type estimator in three disturbance models is investigated. Furthermore, we give the generalized Cook's distance to assess the influence, and illustrate the efficiency of the proposed method with an example.
Cross platform classification of microarrays by rank comparison
Lee, Sunho ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 475~486
DOI : 10.7465/jkdi.2015.26.2.475
Mining the microarray data accumulated in the public data repositories can save experimental cost and time and provide valuable biomedical information. Big data analysis pooling multiple data sets increases statistical power, improves the reliability of the results, and reduces the specific bias of the individual study. However, integrating several data sets from different studies requires dealing with many problems. In this study, I limited the focus to cross platform classification, in which the platform of a testing sample is different from the platform of the training set, and suggested a simple classification method based on rank. This method is compared with diagonal linear discriminant analysis, the k nearest neighbor method and the support vector machine using real cross platform example data sets for two cancers.
Robust Bayesian analysis for autoregressive models
Ryu, Hyunnam ; Kim, Dal Ho ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 487~493
DOI : 10.7465/jkdi.2015.26.2.487
Time series data sometimes violate the normality assumption. For cases where the assumption of normality is untenable, more flexible models can be adopted to accommodate heavy tails. The exponential power distribution (EPD) is considered as a possible candidate for the errors of time series models that may violate the normality assumption. Moreover, the use of flexible error models such as the EPD can make robust analysis possible. In this paper, we consider the EPD as the flexible distribution for the errors of autoregressive models. We also represent this distribution as a scale mixture of uniforms, and this form enables efficient Bayesian estimation via Markov chain Monte Carlo (MCMC) methods.
Review on statistical methods for large spatial Gaussian data
Park, Jincheol ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 495~504
DOI : 10.7465/jkdi.2015.26.2.495
The Gaussian geostatistical model has been widely used for modeling spatial data. However, this model suffers from a severe computational difficulty because inference requires inverting a large covariance matrix when evaluating the log-likelihood. To address this computational challenge, three strategies have been employed: likelihood approximation, lower dimensional space approximation, and Markov random field approximation. In this paper, we review statistical approaches to the computational challenge. As an illustration, we also apply the integrated nested Laplace approximation (INLA), one of the Markov random field approximation approaches, to real data to provide an example of its use in practice with large spatial data.
A note on standardization in penalized regressions
Lee, Sangin ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 505~516
DOI : 10.7465/jkdi.2015.26.2.505
We consider sparse high-dimensional linear regression models. Penalized regressions have been used as effective methods for variable selection and estimation in high-dimensional models. In penalized regressions, it is common practice to standardize the variables first and then fit the penalized model with the standardized variables. Finally, the estimated coefficients from the penalized model are transformed back to the scale of the original variables. However, this procedure produces a slightly different solution from that of the corresponding original penalized problem. In this paper, we investigate issues in the standardization of variables in penalized regressions and formulate the definition of the standardized penalized estimator. In addition, we compare the original penalized estimator with the standardized penalized estimator through simulation studies and real data analysis.
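The claim that standardizing before penalizing changes the solution can be checked by hand in the simplest possible case. The sketch below uses a one-predictor ridge regression without intercept as a stand-in penalty, since the abstract does not say which penalties the paper studies. The function names and data are assumptions for the example. Algebraically, the back-transformed standardized slope equals sum(x*y) / (sum(x*x) + lam * s**2), so it matches the direct slope only when lam is 0 or the scale s is 1.

```python
import math


def ridge_slope(x, y, lam):
    """One-predictor ridge slope without intercept: sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)


def ridge_slope_standardized(x, y, lam):
    """Scale x to unit standard deviation, fit ridge, return the slope on the original scale."""
    n = len(x)
    mean = sum(x) / n
    s = math.sqrt(sum((a - mean) ** 2 for a in x) / n)
    z = [a / s for a in x]
    return ridge_slope(z, y, lam) / s


x = [-2.0, 0.0, 2.0]  # already centered, illustrative values
y = [-1.0, 0.0, 1.0]
direct = ridge_slope(x, y, lam=1.0)
via_standardizing = ridge_slope_standardized(x, y, lam=1.0)
```

Here the two slopes differ, while with lam set to 0 both reduce to the ordinary least squares slope.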
Semisupervised support vector quantile regression
Seok, Kyungha ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 517~524
DOI : 10.7465/jkdi.2015.26.2.517
Unlabeled examples are easier and less expensive to obtain than labeled examples. In this paper, a semisupervised approach is used to utilize such examples in an effort to enhance the predictive performance of nonlinear quantile regression. We propose a semisupervised quantile regression method named semisupervised support vector quantile regression (S2SVQR), which is based on the support vector machine. A generalized approximate cross validation method is used to choose the hyper-parameters that affect the performance of the estimator. The experimental results confirm the successful performance of the proposed S2SVQR.
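Quantile regression is usually formulated through the check (pinball) loss, and a short sketch of that loss may help in reading the abstract above. The abstract does not state the loss used in S2SVQR, so its use here is an assumption, and the function names and numbers are purely illustrative.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau: tau * r if r >= 0, (tau - 1) * r if r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def mean_pinball_loss(y_true, y_pred, tau):
    """Average check loss of predictions y_pred against observations y_true."""
    return sum(pinball_loss(y - p, tau) for y, p in zip(y_true, y_pred)) / len(y_true)


# At tau = 0.9, under-prediction (positive residual) costs nine times more
# than over-prediction of the same size.
under = pinball_loss(1.0, 0.9)
over = pinball_loss(-1.0, 0.9)
```

Because the penalty is asymmetric, a predictor minimizing this loss is pushed toward the tau-th conditional quantile and not the conditional mean.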
A change point estimator in monitoring the parameters of a multivariate IMA(1, 1) model
Sohn, Sun-Yoel ; Cho, Gyo-Young ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 525~533
DOI : 10.7465/jkdi.2015.26.2.525
Modern production processes have a very complex structure in which observations are correlated with several factors. When an error signal occurs in the process, it is very difficult to know the root causes of the out-of-control signal because of insufficient information. However, if we know the time of the change, the system can be controlled more easily. To this end, we derive a maximum likelihood estimator (MLE) of the change point in a process when observations are from a multivariate IMA(1,1) process by monitoring residual vectors of the model. Numerical results show that the MLE of the change point is effective in detecting changes in a process.
Estimation for scale parameter of type-I extreme value distribution
Choi, Byungjin ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 535~545
DOI : 10.7465/jkdi.2015.26.2.535
In a wide range of applications including hydrology, the type-I extreme value distribution has been extensively used as a probabilistic model for analyzing extreme events. In this paper, we introduce methods for estimating the scale parameter of the type-I extreme value distribution. A simulation study is performed to compare the estimators in terms of mean-squared error and bias, and the obtained results are provided.
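As one concrete example of a scale estimator for the type-I extreme value (Gumbel) distribution, the sketch below uses the method of moments. It relies on the known fact that the variance of this distribution is pi**2 * beta**2 / 6. The abstract does not list the estimators the paper compares, so this choice is an assumption, as are the function names and parameter values.

```python
import math
import random


def gumbel_sample(mu, beta, n, seed=0):
    """Draw n values from the type-I extreme value (maximum) distribution by inverse CDF."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # skip u == 0 to avoid log(0)
            draws.append(mu - beta * math.log(-math.log(u)))
    return draws


def scale_by_moments(data):
    """Method-of-moments scale estimate: sample standard deviation * sqrt(6) / pi."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)
    return math.sqrt(6.0 * var) / math.pi


data = gumbel_sample(mu=1.0, beta=2.0, n=20000, seed=0)
estimate = scale_by_moments(data)
```

Repeating the simulation over many seeds and averaging the error is the usual way to compare such estimators by bias and mean-squared error, as the abstract describes.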
Erratum to "Comparative analysis of Bayesian and maximum likelihood estimators in change point problems with Poisson process"
Kitabo, Cheru Atsmegiorgis ; Kim, Jong Tae ;
Journal of the Korean Data and Information Science Society, volume 26, issue 2, 2015, Pages 547~547
DOI : 10.7465/jkdi.2015.26.2.547