REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
Journal of the Korean Data and Information Science Society
Korean Data and Information Science Society
Volume 25, Issue 4 - Jul 2014
Deal price model in Deal-or-No-Deal game
Song, Seolhee ; Ahn, Soohan ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 697~703
DOI : 10.7465/jkdi.2014.25.4.697
Deal-or-No-Deal is a popular TV game show on NBC in the USA, consisting of at most 10 stages. At each stage from the first to the ninth, a banker offers a deal price to the participants. In this paper, we aim to uncover the banker's deal price model using a constrained linear model and quadratic programming. As a result, we provide a linear model for the deal price at each stage and then show, using simulation data, that the deal price equals the value given by this linear model rounded to the nearest integer.
Effect of vapocoolant spray and EMLA cream upon DPT vaccination pain in infants
Jang, Gunja ; Jeon, Eunyoung ; Lee, Eunsil ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 705~714
DOI : 10.7465/jkdi.2014.25.4.705
This study aimed to identify the effect of vapocoolant spray and EMLA (eutectic mixture of local anesthetics) cream on DPT (diphtheria-pertussis-tetanus) vaccine-associated injection pain in infants. A nonequivalent control group pretest-posttest design was used. The subjects were 49 infants: 19 in the control group, 15 in the vapocoolant group, and 15 in the EMLA group. Pulse and oxygen saturation were measured as pain indicators before and after DPT vaccination. FLACC was also measured after vaccination. The data were collected between October 2009 and June 2010 and analyzed using SPSS WIN 20.0. The EMLA group showed significantly smaller changes in pulse (F=43.37, p<.001) and oxygen saturation (F=9.86, p=.003) than the control and vapocoolant groups. However, there was no difference in FLACC pain scores among the three groups. These results showed that EMLA cream is an effective agent for reducing DPT vaccination-associated pain. Therefore, EMLA cream can be used to reduce pain at public health centers and in clinical settings.
Intergenerational economic mobility in Korea using a quantile regression analysis
Richey, Jeremiah ; Jeong, Kiho ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 715~725
DOI : 10.7465/jkdi.2014.25.4.715
This study uses quantile regression analysis to investigate intergenerational economic mobility in Korea. The analysis is based on data from the 1st through 11th waves of the Korean Labor and Income Panel Study (KLIPS), conducted from 1998 to 2008. The household nature of the data allows us to link parents' incomes to children's incomes at different points in time. Using quantile regression instead of mean regression reveals that the effect of fathers' earnings differs across the conditional distribution of sons' earnings, being larger at the upper quantiles than at the lower quantiles. After controlling for the effect of sons' college education by including a dummy variable for the degree, however, the pattern among the quantile effects of fathers' earnings is no longer clear. Instead, a new pattern emerges: education has a much larger effect at the upper quantiles than at the lower ones. Using nonparametric estimates of conditional density curves based on the quantile regression results, we present in graphical form some interesting features that are not obvious from the numerical analysis.
Analysis of employee's characteristic using data visualization
Cho, Jang Sik ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 727~736
DOI : 10.7465/jkdi.2014.25.4.727
The fundamental concern of this paper is to analyze, from the viewpoint of data visualization, the effects of several characteristics on the employment of new college graduates. We use individual and department characteristic data on students who graduated from K-university in 2010. We apply multiple correspondence analysis, decision tree analysis, association rules, and social network analysis for data visualization. The results of the analysis are summarized as follows. First, an analysis of the determinants of employment shows that GPA, department category, age, number of majors, and recruiting time affect the employment rate. Second, a higher GPA and a natural science department category positively affect the employment rate. Finally, younger age, a single major, and an early recruiting time also positively affect the employment rate.
Adjustment of heterogeneous variance by milk production level of dairy herd
Cho, Kwang-Hyun ; Lee, Joon-Ho ; Park, Kyung-Do ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 737~743
DOI : 10.7465/jkdi.2014.25.4.737
This experiment was conducted to compare heterogeneity of variance in the dairy cattle population and to induce homogeneity of variance using 502,228 performance test records of dairy cattle. The estimates of heritability for milk, fat, and protein yields were 0.28, 0.26, and 0.24, respectively, and the estimated average breeding value by birth year was lower overall in the HV (heterogeneous variance) model than in the animal model. The average breeding values of milk, fat, and protein yields for 545 sire bulls meeting the criteria of the Interbull MACE programme were 453.54 kg, 10.75 kg, and 14.33 kg, respectively; when heterogeneity was adjusted they were 432.06 kg, 10.15 kg, and 13.40 kg, respectively, lower for all milk traits. In the animal model, the phenotypic correlation coefficients between datasets I and II were 0.839 for milk yields, 0.821 for fat yields, and 0.837 for protein yields, while in the HV model they were 0.841, 0.820, and 0.836, respectively, showing similar results in the two models. Comparing the animal model with the HV model, the regression coefficient on the ratio of the number of daughters by calving year increased from 15.157 to 16.105 for milk yields and from -0.227 to -0.196 for fat yields, but decreased from 0.630 to 0.586 for protein yields.
Cooperative effect in space-dependent Parrondo games
Lee, Jiyeon ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 745~753
DOI : 10.7465/jkdi.2014.25.4.745
Parrondo's paradox is the counter-intuitive situation in which individually losing games can combine to win, or individually winning games can combine to lose. In this paper, we compare history-dependent Parrondo games and space-dependent Parrondo games played cooperatively by multiple players. We show that there is a probability region where the history-dependent Parrondo game is a losing game whereas the space-dependent Parrondo game is a winning game.
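The paradox itself can be illustrated with a small calculation. The sketch below does not implement the history-dependent or space-dependent games studied in the paper; it assumes the classic single-player, capital-dependent version (game A wins with probability 1/2 - eps; game B wins with probability 1/10 - eps when the capital is a multiple of 3 and 3/4 - eps otherwise). The function names and the value eps = 0.005 are illustrative assumptions.

```python
def stationary_mod3(p0, p1, iters=5000):
    """Stationary distribution of (capital mod 3) when the win probability
    is p0 in state 0 and p1 in states 1 and 2, found by power iteration."""
    pi = [1 / 3, 1 / 3, 1 / 3]
    win = [p0, p1, p1]
    for _ in range(iters):
        nxt = [0.0, 0.0, 0.0]
        for s in range(3):
            nxt[(s + 1) % 3] += pi[s] * win[s]        # win: capital + 1
            nxt[(s - 1) % 3] += pi[s] * (1 - win[s])  # loss: capital - 1
        pi = nxt
    return pi


def drift(p0, p1):
    """Long-run expected gain per round under the stationary distribution."""
    pi = stationary_mod3(p0, p1)
    p_win = pi[0] * p0 + (pi[1] + pi[2]) * p1
    return 2 * p_win - 1


EPS = 0.005                           # assumed bias making A and B losing
P_A = 0.5 - EPS                       # game A: same probability in every state
P_B0, P_B1 = 0.1 - EPS, 0.75 - EPS    # game B: depends on capital mod 3

drift_a = drift(P_A, P_A)
drift_b = drift(P_B0, P_B1)
# Choosing A or B with probability 1/2 each round averages the win probabilities.
drift_mix = drift((P_A + P_B0) / 2, (P_A + P_B1) / 2)

print(drift_a, drift_b, drift_mix)
```

Under these assumptions both games have negative drift when played alone, while the random mixture has positive drift, which is the "losing games combine to win" case described above.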
Exploratory study on the relationship between supply chain performance and ICT capabilities
Oh, Soojung ; Oh, Kwangsik ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 755~767
DOI : 10.7465/jkdi.2014.25.4.755
Recently, many firms have introduced information and communication technology (ICT) into their supply chains. However, existing studies have not yet reached a definite conclusion about the impact of ICT on the supply chain. This study therefore subdivides supply chain performance, which previous researchers have studied only in aggregate, and suggests a perspective on the use of a firm's ICT capabilities. We classify firms into four groups according to their ICT capabilities and then analyze the differences between groups on each factor of supply chain performance using ANOVA and the Tukey method. The analysis shows that the group in which all ICT capabilities are high has the highest level of integration and flexibility performance among the supply chain performance factors, whereas the group in which all ICT capabilities are low has the lowest level of integration and flexibility performance. We also provide more precise and specific information to practitioners by analyzing the differences between groups on detailed measurements of the integration and flexibility variables.
A study on the invigorating strategies for open government data
Hong, Yeon Woong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 769~777
DOI : 10.7465/jkdi.2014.25.4.769
Recently, many countries have established open government data platforms to disclose data owned by governments or government-controlled entities, which anyone can freely use, reuse, and redistribute. Open government data can help people make better decisions in their own lives and be more active in society. Open data also makes government more effective and transparent, which ultimately reduces costs. This paper explains open data concepts and the current situation in Korea, and suggests detailed invigorating strategies such as a data quality policy, a data unification and standardization policy, an open data service platform, and an integrated support plan for big data and open government data.
A case study on programming academic achievement: Focused on the hardware curriculum
Lee, Seung-Woo ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 779~790
DOI : 10.7465/jkdi.2014.25.4.779
The purpose of this study is to assess the programming capability of students majoring in H/W. To this end, first, academic achievement in the C and C++ languages is measured for graduates-to-be majoring in H/W and S/W. Second, the H/W and S/W curricula are compared and analyzed to derive the factors that influence academic achievement in programming. Third, to determine the influence of mathematical competence on academic achievement in programming, the relationship between the mathematics curriculum and the programming curriculum is examined by regression analysis. This paper presents an effective teaching method for improving academic achievement in programming in the H/W curriculum.
Modified Wu and Clements-Croome's PM model
Jung, Ki Mun ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 791~798
DOI : 10.7465/jkdi.2014.25.4.791
Wu and Clements-Croome (2005) proposed a preventive maintenance (PM) model with random maintenance quality. They assume that each PM resets the failure rate to zero and that the rate of increase of the failure rate gets higher after each additional PM. However, a system may not be restored to as good as new immediately after PM is completed. Thus, this paper modifies Wu and Clements-Croome's PM model and proposes an optimal PM policy. To determine the optimal PM policy, we use the expected cost rate per unit time for our model; that is, we obtain the optimal number and the optimal period of PM by minimizing the expected cost rate per unit time. Numerical examples are presented for illustrative purposes.
Type III sums of squares by projections
Choi, Jaesung ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 799~805
DOI : 10.7465/jkdi.2014.25.4.799
This paper deals with a method for obtaining the Type III sums of squares on the basis of projections under a two-way fixed effects model. For unbalanced data, the total sum of squares is in general not equal to the sum of the componentwise Type III sums of squares; the two quantities differ. The suggested projection-based method can detect where the differences occur and how large they are, which the traditional ANOVA method cannot explain clearly. The paper also discusses how the eigenvectors and eigenvalues of the projection matrices can be used to obtain the Type III sums of squares.
The sparse vector autoregressive model for PM10 in Korea
Lee, Wonseok ; Baek, Changryong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 807~817
DOI : 10.7465/jkdi.2014.25.4.807
This paper considers multivariate time series modelling of PM10 data in Korea collected from 2008 to 2011. We consider both the temporal and spatial dependencies of PM10 by applying the sparse vector autoregressive (sVAR) modelling proposed by Davis et al. (2013). It uses the partial spectral coherence to measure cross correlation between different regions, which in turn provides sparsity in the model while balancing model parsimony and goodness of fit. It is also shown that sVAR outperforms the usual vector autoregressive (VAR) model in forecasting.
Feature analysis of deaf students' English language by frequency
Lee, Gun-Min ; Park, Hye Jung ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 819~828
DOI : 10.7465/jkdi.2014.25.4.819
In this paper, we analyze the characteristics of deaf students' English vocalization and present basic data for the development of personalized English learning aids that reflect these features. We visited special schools for the hearing impaired in Seoul and Daegu and recorded the English vocalization of deaf students. We analyzed the data with Praat, a professional voice analysis program. The voice features of deaf students' English vocalization were extracted and then compared with those of non-deaf students' English vocalization.
A study of the factors influential on a health-related quality of life using complex sample design
Park, Cheolyong ; Choi, Hyun Seok ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 829~846
DOI : 10.7465/jkdi.2014.25.4.829
Using a complex sample design, this article analyzes differences by sex and age group in mental health, physical activity, lifestyle diseases, drinking, and smoking, based on data from the fifth Korea National Health and Nutrition Examination Survey (2011-2012), and then analyzes the effect of these factors on EQ-5D, a measure of health-related quality of life. The results show that mental health, physical activity, lifestyle diseases, drinking, and smoking differ statistically significantly among gender and age groups, and that age group, education level, diabetes, perceived stress, and suicidal ideation have a statistically significant influence on EQ-5D.
The study of changes in performance in KLPGA using growth curve analysis
Kim, Nam Jin ; Min, Dae Kee ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 847~855
DOI : 10.7465/jkdi.2014.25.4.847
In recent years, monetary rewards in women's golf have increased and performances have improved significantly compared with other sports. Sports marketing has become more active in Asia, and the number of Korean players with good scores in the LPGA is increasing. For these reasons, golf is becoming increasingly popular. The prize money is higher than in other sports, and the economic benefits are growing thanks to financial incentives such as sponsorships. These prospects actively affect women's golf. The number of rookies continues to increase, and their performances improve day by day. In this study, I analyze the changes in performance over the 5 years from 2009 using growth curve analysis. According to the results, driving distance and average putting improved, but greens in regulation decreased.
Proposition of causally confirmed measures in association rule mining
Park, Hee Chang ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 857~868
DOI : 10.7465/jkdi.2014.25.4.857
Data mining is a representative analysis methodology in the era of big data; it is the process of analyzing a massive database and summarizing it into meaningful information. The association rule technique finds relationships among items in a huge database using interestingness measures such as support, confidence, and lift. However, these interestingness measures cannot be used to establish a causal relationship between antecedent and consequent item sets. Moreover, they do not reveal the direction of association. This paper proposes causally confirmed association thresholds to compensate for these problems and then checks the three conditions of interestingness measures. Basic association thresholds, causal association thresholds, and causally confirmed association thresholds are compared through simulation studies. The results show that causally confirmed association thresholds are better than basic and causal association thresholds.
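For readers unfamiliar with the basic measures named above, the following is a minimal sketch using the standard textbook definitions of support, confidence, and lift. It does not implement the causally confirmed thresholds proposed in the paper, and the transactions and item names are made up for illustration.

```python
# Hypothetical toy transaction database (illustrative only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread"},
]


def support(itemset, txns):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in txns) / len(txns)


def confidence(antecedent, consequent, txns):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent, txns) / support(antecedent, txns)


def lift(antecedent, consequent, txns):
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(antecedent, consequent, txns) / support(consequent, txns)


a, b = {"bread"}, {"milk"}
print(support(a | b, transactions))
print(confidence(a, b, transactions), confidence(b, a, transactions))
print(lift(a, b, transactions), lift(b, a, transactions))
```

Note that lift takes the same value for bread -> milk and milk -> bread, so on its own it cannot indicate which way an association runs.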
Effective education plan of probability and statistics in the H/W curriculum
Lee, Seung-Woo ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 869~880
DOI : 10.7465/jkdi.2014.25.4.869
This study aims at presenting the educational model for the effective application of probability and statistics to the H/W curriculum. In order to do this, this paper conducts a survey with H/W major college students, and then analyzes how probability and statistics can be correlated with other H/W core subjects and how the knowledge of probability and statistics can affect the understanding of H/W majors through the actual class experiment. Consequently this study suggests probability and statistics as a prerequisite subject in the H/W curriculum.
Comparison of model selection criteria in graphical LASSO
Ahn, Hyeongseok ; Park, Changyi ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 881~891
DOI : 10.7465/jkdi.2014.25.4.881
Graphical models can be used as an intuitive tool for modeling a complex stochastic system with a large number of interrelated variables, because the conditional independence between random variables can be visualized as a network. The graphical least absolute shrinkage and selection operator (LASSO) is considered effective in avoiding overfitting when estimating Gaussian graphical models for high dimensional data. In this paper, we consider the model selection problem in graphical LASSO. In particular, we compare various model selection criteria via simulations and analyze a real financial data set.
Reliability analysis of warranty returns data
Baik, Jaiwook ; Jo, Jinnam ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 893~901
DOI : 10.7465/jkdi.2014.25.4.893
A certain number of products are sold each month and some of them are returned for repair. In this study, both the return rate and the cumulative return rate are plotted to show the general trend in how many products are returned over time. Next, this type of summary data can be regarded as a conglomeration of both left and right censored data, so reliability analysis is attempted for it. Lastly, left censored data can be traced to find the exact time period during which the product was claimed. In that case the left censored data can be treated as failure data, so a similar type of reliability analysis is attempted for the resulting right censored data.
Goodness-of-fit tests for the inverse Weibull or extreme value distribution based on multiply type-II censored samples
Kang, Suk-Bok ; Han, Jun-Tae ; Seo, Yeon-Ju ; Jeong, Jina ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 903~914
DOI : 10.7465/jkdi.2014.25.4.903
The inverse Weibull distribution has been proposed as a model in the analysis of life testing data. It has also recently been derived as a suitable model for describing degradation phenomena of mechanical components such as the dynamic components (pistons, crankshaft, etc.) of diesel engines. In this paper, we derive the approximate maximum likelihood estimators of the scale parameter and the shape parameter of the inverse Weibull distribution under multiply type-II censoring. We also develop four modified empirical distribution function (EDF) type tests for the inverse Weibull or extreme value distribution based on multiply type-II censored samples. In addition, we propose a modified normalized sample Lorenz curve plot and a new test statistic.
Stationary analysis of the surplus process in a risk model with investments
Lee, Eui Yong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 915~920
DOI : 10.7465/jkdi.2014.25.4.915
We consider a continuous time surplus process with investments the sizes of which are independent and identically distributed. It is assumed that an investment of the surplus to other business is made, if and only if the surplus reaches a given sufficient level. We establish an integro-differential equation for the distribution function of the surplus and solve the equation to obtain the moment generating function for the stationary distribution of the surplus. As a consequence, we obtain the first and second moments of the level of the surplus in an infinite horizon.
Estimation for a bivariate survival model based on exponential distributions with a location parameter
Hong, Yeon Woong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 921~929
DOI : 10.7465/jkdi.2014.25.4.921
A bivariate exponential distribution with a location parameter is proposed as a model for a two-component shared load system with a guarantee time. Some statistical properties of the proposed model are investigated. The maximum likelihood estimators and uniformly minimum variance unbiased estimators of the parameters, the mean time to failure, and the reliability function of the system are obtained when the guarantee time is unknown. Simulation studies are given to illustrate the results.
Support vector expectile regression using IRWLS procedure
Choi, Kook-Lyeol ; Shim, Jooyong ; Seok, Kyungha ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 931~939
DOI : 10.7465/jkdi.2014.25.4.931
In this paper we propose the iteratively reweighted least squares procedure to solve the quadratic programming problem of support vector expectile regression with an asymmetrically weighted squares loss function. The proposed procedure enables us to select the appropriate hyperparameters easily by using the generalized cross validation function. Through numerical studies on the artificial and the real data sets we show the effectiveness of the proposed method on the estimation performances.
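The reweighting idea can be seen in the simplest possible setting. The sketch below is not the support vector (kernel) procedure proposed in the paper; it assumes an intercept-only model, so that the tau-expectile of a sample is the value mu minimizing the asymmetrically weighted squared loss sum_i |tau - 1(y_i <= mu)| (y_i - mu)^2, and each iteration reduces to a weighted mean. The function name and sample values are illustrative assumptions.

```python
def expectile(values, tau, tol=1e-12, max_iter=1000):
    """tau-expectile (0 < tau < 1) of a sample via iteratively
    reweighted least squares for an intercept-only model."""
    mu = sum(values) / len(values)  # the mean is the tau = 0.5 solution
    for _ in range(max_iter):
        # Weight tau for observations above the current fit, 1 - tau otherwise.
        weights = [tau if y > mu else 1.0 - tau for y in values]
        # Weighted least squares with only an intercept is a weighted mean.
        new_mu = sum(w * y for w, y in zip(weights, values)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


sample = [1.0, 2.0, 3.0, 4.0, 10.0]  # made-up data
for tau in (0.1, 0.5, 0.9):
    print(tau, expectile(sample, tau))
```

With tau = 0.5 all weights are equal and the result is the sample mean; larger tau pulls the fit toward the upper tail. In a regression setting the weighted mean would be replaced by a weighted least squares fit at each step.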
Estimating small area proportions with kernel logistic regressions models
Shim, Jooyong ; Hwang, Changha ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 941~949
DOI : 10.7465/jkdi.2014.25.4.941
The unit level logistic regression model with mixed effects has been used for estimating small area proportions; it treats the spatial effects as random effects and assumes linearity between the logistic link and the covariates. However, when the functional form of the relationship between the logistic link and the covariates is not linear, it may lead to biased estimators of the small area proportions. In this paper, we relax the linearity assumption and propose two types of kernel-based logistic regression models for estimating small area proportions. We also demonstrate the efficiency of our proposed models using simulated data and real data.
A note on nonparametric density deconvolution by weighted kernel estimators
Lee, Sungho ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 951~959
DOI : 10.7465/jkdi.2014.25.4.951
Recently, Hazelton and Turlach (2009) proposed a weighted kernel density estimator for the deconvolution problem. In the case of Gaussian kernels and measurement error, they argued that the weighted kernel density estimator is competitive with the classical deconvolution kernel estimator. In this paper we consider weighted kernel density estimators when sample observations are contaminated by double exponentially distributed errors. The performance of the weighted kernel density estimators is compared with that of the classical deconvolution kernel estimator and of the kernel density estimator based on the support vector regression method by means of a simulation study. The weighted density estimator with the Gaussian kernel shows numerical instability in the practical implementation of the optimization function. The weighted density estimates with the double exponential kernel have patterns very similar to those of the classical kernel density estimates in the simulations, but their shape is less satisfactory than that of the classical kernel density estimator with the Gaussian kernel.
Default Bayesian testing for the equality of the scale parameters of several inverted exponential distributions
Kang, Sang Gil ; Kim, Dal Ho ; Lee, Woo Dong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 4, 2014, Pages 961~970
DOI : 10.7465/jkdi.2014.25.4.961
This article deals with the problem of testing the equality of the scale parameters of several inverted exponential distributions. We propose Bayesian hypothesis testing procedures for the equality of the scale parameters under the noninformative prior. The noninformative prior is usually improper, which creates a calibration problem: the Bayes factor is defined only up to a multiplicative constant. We therefore propose default Bayesian hypothesis testing procedures based on the fractional Bayes factor and the intrinsic Bayes factors under the reference priors. A simulation study and an example are provided.