Journal of the Korean Data and Information Science Society
Korean Data and Information Science Society
Volume 24, Issue 3 - May 2013
A study on the analysis of customer loan for the credit finance company using classification model
Kim, Tae-Hyung ; Kim, Yeong-Hwa ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 411~425
DOI : 10.7465/jkdi.2013.24.3.411
The importance and necessity of credit loans are increasing over time. As borrower risk increases, so naturally does the risk of non-performing loans. Accurate prediction is therefore needed to prevent losses for a credit loan company. Our goal is to build a reliable and accurate prediction model, and we proceed in the following steps. First, we obtain an appropriate sample using several resampling methods. Second, we consider a variety of models and tools to fit the resampled data. Finally, to find the best model for our real data, the various models are compared and assessed.
Enhancing the corporate image through social media: An approach based on multi-dimensional scaling
Kim, Suhyun ; Lee, Hanjun ; Suh, Yongmoo ; Han, Jinyoung ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 427~436
DOI : 10.7465/jkdi.2013.24.3.427
Social media is drawing attention among companies for its potential as a marketing tool. There are many types of social media with varied characteristics, so choosing the social media appropriate to a company's purpose is important. In this paper, we conduct a comparative analysis of popular social media such as Facebook, Twitter, Naver blog, YouTube, Cyworld and Me2day using multidimensional scaling. The results show that social media differ in how effectively they enhance the various dimensions of corporate image. This result can be used in developing social media based marketing strategies.
A study on distribution comparison of response packets for major portal sites
Ryu, Gui-Yeol ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 437~444
DOI : 10.7465/jkdi.2013.24.3.437
The objective of this study is to compare the distributions of response packets of three portal sites: Naver, Daum and Nate. The experiments ran from May 19th 2010 to November 7th 2012, and the number of experiments is 4,642. The distributions of Naver and Nate are bimodal. The distribution of Daum has a long right tail. The three distributions differ at the 1% significance level under the chi-square test and the two-sample Kolmogorov-Smirnov test. From proportions and percentiles, Naver has the distribution with the largest values, Nate is second, and Daum has the distribution with the smallest values. Portal pages must be made lighter, alongside other technologies, to increase response speed. We expect our results to stimulate competition among portal sites.
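The two-sample Kolmogorov-Smirnov comparison mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis: the packet counts `site_a` and `site_b` are made-up numbers, and the critical value uses the usual large-sample approximation, which is only rough for samples this small.

```python
import math
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in sorted(set(a) | set(b)):
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d

def ks_critical_value(n, m, alpha=0.01):
    """Large-sample critical value c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))

# Hypothetical response packet counts for two sites (illustrative only).
site_a = [120, 135, 150, 160, 410, 420, 430, 445]
site_b = [80, 85, 90, 95, 100, 110, 130, 300]

d = ks_statistic(site_a, site_b)
reject_at_1_percent = d > ks_critical_value(len(site_a), len(site_b), alpha=0.01)
print(d, reject_at_1_percent)
```

The statistic depends only on the ordering of the pooled observations, which is why the test is sensitive to differences in shape (such as bimodality or a long right tail) and not only to differences in mean.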
A study on effects of limited replacements in exponential model
Cho, Kil-Ho ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 445~451
DOI : 10.7465/jkdi.2013.24.3.445
We consider estimators for the parameters of the exponential model with limited replacements under the type I censoring scheme. We also propose a desirable number of replacements that provides similar effects in terms of mean squared error.
Structural relationships among achievement goal orientation, self-leadership and sport motivation of skating athletes
Nam, Jung Hoon ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 453~464
DOI : 10.7465/jkdi.2013.24.3.453
This study aimed to verify the structural relationships among achievement goal orientation, self-leadership and sport motivation of skating athletes. The data were collected from skating athletes in the Seoul and Gyonggi area, and a total of 369 responses were used. To verify the relationships among achievement goal orientation, self-leadership and sport motivation, the construct validity and reliability of each factor were analyzed using SPSS 18.0 and AMOS 18.0, and the relationships among the three factors were analyzed using covariance structure analysis. The results were as follows. First, the achievement goal orientation of skating athletes had positive effects on self-leadership. Second, the achievement goal orientation of skating athletes had positive effects on sport motivation. Third, the self-leadership of skating athletes had positive effects on sport motivation.
On the asymptotic correlationship for some process capability indices Ĉ
Cho, Joong-Jae ; Yu, Hye-Kyung ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 465~475
DOI : 10.7465/jkdi.2013.24.3.465
Higher quality level is generally perceived by customers as improved performance by assigning a correspondingly higher satisfaction score. Usually, the quality level is measured by process capability indices. The index is used to determine whether a production process is capable of producing items within a specified tolerance. Some useful process capability indices have been widely used in six sigma industries to assess process performance. Most evaluations of process capability indices focus on point estimates, which may result in unreliable assessments of process performance. It is therefore necessary to investigate the asymptotic correlationship among process capability indices. In this paper, we study the asymptotic correlationship for some process capability indices under the normal process.
An implementation of sample size and power calculations in testing differences of normal means
Sim, Songyong ; Choi, Kyuhyeok ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 477~485
DOI : 10.7465/jkdi.2013.24.3.477
In this paper, we consider the sample sizes required for each group in the independent two-sample test for normal populations when both the type I and type II error probabilities are specified, with sample sizes and variances possibly different. We derive the sample sizes and the power of the tests and implement them through web programming. The results are available on the World Wide Web. We also provide the power calculations on the web.
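A calculation of this kind can be sketched with the textbook normal (z) approximation for a two-sided test with known variances. This is an assumption made for illustration, not necessarily the derivation used in the paper; the function names and the `ratio` parameter (n2 / n1) are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def sample_sizes(delta, sd1, sd2, alpha=0.05, power=0.80, ratio=1.0):
    """Group sizes (n1, n2) with n2 = ratio * n1 for detecting a mean
    difference `delta` with a two-sided z-test (known variances)."""
    z_a = Z.inv_cdf(1 - alpha / 2)
    z_b = Z.inv_cdf(power)
    n1 = (sd1 ** 2 + sd2 ** 2 / ratio) * (z_a + z_b) ** 2 / delta ** 2
    n1 = math.ceil(n1)
    return n1, math.ceil(ratio * n1)

def achieved_power(delta, sd1, sd2, n1, n2, alpha=0.05):
    """Power of the two-sided z-test for the given group sizes."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z_a = Z.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / se
    return Z.cdf(shift - z_a) + Z.cdf(-shift - z_a)

n1, n2 = sample_sizes(delta=0.5, sd1=1.0, sd2=2.0, ratio=2.0)
print(n1, n2, achieved_power(0.5, 1.0, 2.0, n1, n2))
```

Allowing `ratio` to differ from 1 reflects the setting in the abstract where the group sizes and variances may be unequal; a t-based calculation would give slightly larger sizes.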
Estimation of genetic parameters for milk flow traits in Holstein dairy cattle
Cho, Kwang-Hyun ; Lee, Hak-Kyo ; Lee, Joon-Ho ; Park, Kyung-Do ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 487~493
DOI : 10.7465/jkdi.2013.24.3.487
This experiment was conducted to investigate whether milking speed traits can be improved by estimating their genetic parameters and to provide basic information for establishing dairy cattle improvement goals. The amount of milk within the first three minutes (3MG) was 8.97 kg, and 57% of total milk was produced within 3 minutes, which was lower than the recommended level (70%). The highest milk flow (HMF) and average milk flow (DMHG) in the main milking phase were 3.66 kg/min and 2.43 kg/min, respectively, which were lower than the recommended levels (4.0~5.0 kg/min and 3.0~4.0 kg/min), suggesting a slower milking speed of domestic dairy cattle compared to that of foreign dairy cattle. The heritability estimates for the highest milk flow (HMF), maximum milk flow (HMG) in one minute and average milk flow (DMHG) in the main milking phase were 0.35, 0.31 and 0.29, respectively, which are suitable for the improvement of traits with medium heritability. The genetic correlation between total milk yield (MGG) and average milk flow (DMHG) in the main milking phase was 0.591, while the genetic correlations among milking speed traits including the highest milk flow (HMF), maximum milk flow (HMG) in one minute and average milk flow (DMHG) in the main milking phase were in the range of 0.889~0.997.
Generating of Pareto frontiers using machine learning
Yun, Yeboon ; Jung, Nayoung ; Yoon, Min ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 495~504
DOI : 10.7465/jkdi.2013.24.3.495
Evolutionary algorithms have been applied to multi-objective optimization problems through approximation methods using computational intelligence. Those methods have been improved gradually in order to generate many approximate Pareto optimal solutions more accurately. This paper introduces a new method that uses a support vector machine to find an approximate Pareto frontier in multi-objective optimization problems. Moreover, the paper applies an evolutionary algorithm to the proposed method in order to generate approximate Pareto frontiers more accurately. Decision making with two or three objective functions can then be performed easily on the basis of the Pareto frontiers visualized by the proposed method. Finally, a few examples demonstrate the effectiveness of the proposed method.
An efficiency evaluation of Korean basketball league using 2010~2011 season data
Choi, Kyoung Ho ; Ahn, Jeong Yong ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 505~513
DOI : 10.7465/jkdi.2013.24.3.505
Basketball is one of the most popular winter sports in Korea. The Korean basketball league started in 1997 with 8 teams. Currently, 10 teams participate, and the average number of spectators in the 2010~2011 season reached 3,815. However, the league has not been making a good profit. It is therefore meaningful to analyze and evaluate operational efficiency to provide basic information for improving it. This study used data envelopment analysis to determine the comparative efficiency of the teams in the Korean basketball league. As a result, Jeonjaland, LG, and KT were evaluated as efficient teams, while Orions, Ginseng, and Mobis were not.
Development of integrative diagnosis methods for the jaundice through statistical analysis
Shin, Im Hee ; Kwak, Sang Gyu ; Kim, Sang Gyung ; Sohn, Ki Cheul ; Jung, Hyun-Jung ; Cho, Yoon-Jeong ; Lee, A-Jin ; Kwon, O Sung ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 515~521
DOI : 10.7465/jkdi.2013.24.3.515
Healthcare approaches in Western medicine and Korean Traditional Medicine (KTM) differ in their understanding of the human being and in their cultural background. This fundamental difference in their approach to human pathology has divided the two fields and hindered medical communication between them. Given this difficulty, integrative medical service is regarded as a novel way to provide patients with the best medical care, as it aims to adapt and combine the advantages of the two fields. This paper presents an integrative approach to treating jaundice, in which the symptoms of dampness and heat under Korean traditional standards are analyzed with statistical methods based on monitored blood test results. On this basis, we explore an approach to diagnosis and treatment with a comprehensive and integrative medicine algorithm.
The proposition of compared and attributably pure confidence in association rule mining
Park, Hee Chang ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 523~532
DOI : 10.7465/jkdi.2013.24.3.523
Generally, data mining is the process of analyzing big data from different perspectives and summarizing it into useful information. The most widely used data mining technique is the generation of association rules, which finds the relevance between two items in a huge database. This technique has been used to find the relationship between sets of items based on interestingness measures such as support, confidence, lift, etc. Among the many interestingness measures, confidence is the most frequently used, but it has the drawback that it cannot determine the direction of the association. The attributably pure confidence and the compared confidence are able to determine the direction of the association, but their ranges are not [-1, +1], so the degree of association cannot be interpreted operationally from their values. This paper proposes a compared and attributably pure confidence to compensate for this drawback and describes some properties of the proposed measure. Comparative studies of confidence, compared confidence, attributably pure confidence, and the proposed measure are shown by numerical example. The results show that the compared and attributably pure confidence is better than the other confidence measures.
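For orientation, the three standard interestingness measures named above (support, confidence and lift) can be computed as in the sketch below. It does not implement the compared or attributably pure confidence measures, whose formulas are not given here; the transactions are made-up data.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Confidence divided by the support of the consequent."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))

# Hypothetical market-basket data (illustrative only).
baskets = [
    {"milk", "bread"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

print(confidence(baskets, {"milk"}, {"bread"}))  # lies in [0, 1]
print(lift(baskets, {"milk"}, {"bread"}))        # compare with 1
```

Confidence always lies in [0, 1], so by itself it does not show whether the items are positively or negatively associated; comparing lift with 1 is one common way to see the sign, which is the kind of gap the signed measures discussed in the abstract are meant to address.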
Simple principal component analysis using Lasso
Park, Cheolyong ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 533~541
DOI : 10.7465/jkdi.2013.24.3.533
In this study, a simple principal component analysis using Lasso is proposed. The method consists of two steps. The first step computes the principal components by ordinary principal component analysis. The second step regresses each principal component on the original data matrix using the Lasso regression method. Each new principal component is then computed as a linear combination of the original data matrix, with the scaled estimated Lasso regression coefficients as the coefficients of the combination. By the properties of Lasso regression models, this method leads to easily interpretable principal components with more zero coefficients. It works because the estimator of the regression of each principal component on the original data matrix is the corresponding eigenvector. The method is applied to real and simulated data sets with the help of an R package for Lasso regression, and its usefulness is demonstrated.
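The two steps described above can be sketched for the first principal component as follows. This is a rough illustration under stated assumptions, not the authors' implementation (which uses an R package): the power iteration, the coordinate-descent Lasso solver, the unit-norm rescaling, the penalty value `lam=0.05` and the toy data are all choices made here.

```python
import math

def first_pc(X, iters=200):
    """Leading eigenvector of X'X by power iteration (X column-centered)."""
    n, p = len(X), len(X[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        scores = [sum(row[j] * v[j] for j in range(p)) for row in X]
        w = [sum(X[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def soft_threshold(a, t):
    """Shrink a toward zero by t, returning zero when |a| <= t."""
    return math.copysign(max(abs(a) - t, 0.0), a)

def lasso(X, y, lam, iters=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)
    col_ms = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(iters):
        for j in range(p):
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ms[j] * b[j]
            new = soft_threshold(rho, lam) / col_ms[j]
            delta = new - b[j]
            for i in range(n):
                resid[i] -= X[i][j] * delta
            b[j] = new
    return b

def sparse_first_pc(X, lam):
    """Step 1: PCA scores. Step 2: Lasso of the scores on X, rescaled."""
    v = first_pc(X)
    scores = [sum(row[j] * v[j] for j in range(len(v))) for row in X]
    b = lasso(X, scores, lam)
    norm = math.sqrt(sum(x * x for x in b))
    return [x / norm for x in b] if norm > 0 else b

# Toy centered data: one dominant variable and two weakly related ones.
a = [1, -1, 1, -1, 1, -1, 1, -1]
g = [1, 1, -1, -1, 1, 1, -1, -1]
c = [1, 1, 1, 1, -1, -1, -1, -1]
X = [[3 * a[i], g[i] + 0.3 * a[i], 0.5 * c[i] + 0.2 * a[i]] for i in range(8)]

dense = first_pc(X)                  # every loading is nonzero
sparse = sparse_first_pc(X, lam=0.05)  # smallest loading shrinks to zero
print(dense, sparse)
```

With `lam=0` the regression simply recovers the ordinary loading vector, so the penalty is the only thing that introduces zeros; larger penalties remove more variables from the component.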
Effectiveness of golf skills to average score using records of PGA, LPGA, KPGA, KLPGA : Multi-group path analysis
Kim, Sae Hyung ; Cho, Jung Hwan ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 543~555
DOI : 10.7465/jkdi.2013.24.3.543
This study analyzes the effect of golf skills (driving distance, rating of fairway, green in regulation, sand save ratio, recovery ratio, putting average) on average score using records of the PGA, LPGA, KPGA and KLPGA. The independent variables were driving distance, rating of fairway, green in regulation, sand save ratio or recovery ratio, and putting average; the dependent variable was the scoring average. To analyze these variables, multi-group (PGA vs LPGA, KPGA vs KLPGA, PGA vs KPGA, LPGA vs KLPGA) path analysis was carried out with the AMOS 18.0 program, and the significance level was set at 0.05. The paths to average score whose coefficients differed significantly between the PGA and LPGA models were those from driving distance and green in regulation. Between the KPGA and KLPGA models, the paths that differed significantly were those from driving distance, recovery ratio, and putting average. Between the PGA and KPGA models, the paths that differed significantly were likewise those from driving distance, recovery ratio, and putting average. There was no significant difference in path coefficients between the LPGA and KLPGA models.
Study on the defence R&D project risk analysis using AHP
Eom, Jae-Seob ; Kim, Seung-Bum ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 557~569
DOI : 10.7465/jkdi.2013.24.3.557
Risk management for a successful defense R&D project should be carried out proactively and consistently over the entire project period, and management priority should depend on the importance of the risk factors. In this study, we verified reliability and validity through factor analysis of the risk factors selected by the Delphi technique. We also obtained the relative importance of the risk factors with the analytic hierarchy process (AHP) and prioritized them for comparison with domestic and overseas research. We found that it is important to settle the requirements and to classify the scope of R&D. It is also important to have a reasonable completion schedule and to secure the necessary resources in the early stage of the project. Unlike in previous studies, technical factors also appeared as critical elements for defense R&D projects.
Designing a life actuarial model with reflection of mortality differential by marital status
Kwon, Hyuk Sung ; Kim, Jung Eun ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 571~584
DOI : 10.7465/jkdi.2013.24.3.571
Various risk factors other than age and sex that affect human mortality have been identified and quantitatively analyzed in previous studies across many areas of research. Marital status is one of the key mortality risk factors, affecting life expectancy directly or indirectly. The relevant results have implications for risk management in both social and private insurance. In this paper, we design a mortality model that reflects mortality differentials by marital status and possible transitions among marital statuses. Various actuarial calculations are performed and related issues are discussed.
A study on comparing short-term wind power prediction models in Gunsan wind farm
Lee, Yung-Seop ; Kim, Jin ; Jang, Moon-Seok ; Kim, Hyun-Goo ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 585~592
DOI : 10.7465/jkdi.2013.24.3.585
As the need for alternative and renewable energy increases, there has been considerable investment in developing wind energy, which causes neither air pollution nor the greenhouse gas effect. Wind energy is environmentally friendly, unlimited in its resources, and can be produced wherever the wind blows. However, since wind energy relies heavily on wind, which is unreliable, efficient energy transmission can be difficult. For this reason, an important factor in wind energy forecasting is the estimation of available wind power. In this study, Gunsan wind farm data were used to compare an ARMA model with a neural network model for more accurate prediction of wind power generation. The neural network model predicted wind power more accurately than the ARMA model.
An educational tool for regression models with dummy variables using Excel VBA
Choi, Hyun Seok ; Park, Cheolyong ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 593~601
DOI : 10.7465/jkdi.2013.24.3.593
We often need to include categorical variables as explanatory variables in regression models. Categorical variables in regression models can be quantified through dummy variables. In this study, we provide an educational tool using Excel VBA that displays regression lines along with test results for regression models with a continuous explanatory variable and one or two categorical explanatory variables. The regression lines and test results are provided step by step for the model(s) with interaction(s), the model(s) without interaction(s) but with dummy variables, and the model without dummy variable(s). With this tool, users can easily understand the meaning of dummy variables and interaction effects through graphics and decide which model is better suited to the data at hand.
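The role of the dummy variable in the interaction model can be sketched as follows, in Python instead of Excel VBA. For one two-level categorical variable, the least squares fit of y = b0 + b1*x + b2*d + b3*x*d gives the same two lines as fitting each group separately, so the sketch fits the groups one at a time and converts the result to dummy coding. The function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Simple least squares line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def interaction_model(xs, ys, groups):
    """Coefficients (b0, b1, b2, b3) of y = b0 + b1*x + b2*d + b3*x*d,
    where d = 1 for group 1 and d = 0 for group 0."""
    g0 = [(x, y) for x, y, g in zip(xs, ys, groups) if g == 0]
    g1 = [(x, y) for x, y, g in zip(xs, ys, groups) if g == 1]
    a0, s0 = fit_line(*zip(*g0))  # line for the reference group
    a1, s1 = fit_line(*zip(*g1))  # line for the other group
    # b2 shifts the intercept and b3 shifts the slope for group 1.
    return a0, s0, a1 - a0, s1 - s0

# Hypothetical data: group 0 follows y = 1 + 2x, group 1 follows y = 3 + 0.5x.
xs = [0, 1, 2, 0, 1, 2]
groups = [0, 0, 0, 1, 1, 1]
ys = [1.0, 3.0, 5.0, 3.0, 3.5, 4.0]

b0, b1, b2, b3 = interaction_model(xs, ys, groups)
print(b0, b1, b2, b3)
```

Setting b3 to zero corresponds to the model with parallel lines (dummy variable but no interaction), and setting both b2 and b3 to zero corresponds to the single-line model without dummy variables; those restricted models need a joint fit and are not covered by this shortcut.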
Analyzing rainfall patterns and pricing rainfall insurance using copula
Choi, Changhui ; Lee, Hangsuck ; Ju, Hyo Chan ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 603~623
DOI : 10.7465/jkdi.2013.24.3.603
This paper proposes analyzing monthly rainfall patterns using copulas and pricing related rainfall insurance with them. We analyze 30 years of monthly precipitation data for 9 Korean cities between June and September using copulas and show that they can effectively generate realistic monthly rainfall patterns. In addition, we show that our copula rainfall models can be used effectively in pricing various kinds of rainfall insurance.
Expected shortfall estimation using kernel machines
Shim, Jooyong ; Hwang, Changha ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 625~636
DOI : 10.7465/jkdi.2013.24.3.625
In this paper we study four kernel machines for estimating expected shortfall, which are constructed through combinations of support vector quantile regression (SVQR), restricted SVQR (RSVQR), least squares support vector machine (LS-SVM) and support vector expectile regression (SVER). These kernel machines have the clear advantage that they yield a nonlinear model without requiring the explicit form of the nonlinear mapping function. Moreover, they need no assumption about the underlying probability distribution of the errors. Through numerical studies on two artificial and two real data sets, we show their effectiveness in estimation performance at various confidence levels.
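As a point of reference for the quantity being estimated, the sketch below computes a plain empirical expected shortfall from a sample of losses. It is not one of the four kernel machines; it assumes one common convention (losses are positive, and the tail is the worst `ceil(n * (1 - alpha))` observations), and other conventions give slightly different values.

```python
import math

def tail_size(n, alpha):
    """Number of worst observations in the (1 - alpha) tail, at least 1."""
    # Rounding guards against floating-point noise such as 3.0000000000000004.
    return max(1, math.ceil(round(n * (1 - alpha), 9)))

def value_at_risk(losses, alpha):
    """Smallest loss inside the (1 - alpha) tail."""
    ordered = sorted(losses)
    return ordered[-tail_size(len(ordered), alpha)]

def expected_shortfall(losses, alpha):
    """Average of the losses in the (1 - alpha) tail."""
    ordered = sorted(losses)
    tail = ordered[-tail_size(len(ordered), alpha):]
    return sum(tail) / len(tail)

# Hypothetical losses (illustrative only).
losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(value_at_risk(losses, 0.8), expected_shortfall(losses, 0.8))
```

Expected shortfall is never smaller than the value at risk at the same level, because it averages the losses at or beyond that threshold.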
Maximum entropy test for infinite order autoregressive models
Lee, Sangyeol ; Lee, Jiyeon ; Noh, Jungsik ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 637~642
DOI : 10.7465/jkdi.2013.24.3.637
In this paper, we consider the maximum entropy test in infinite order autoregressive models. Its asymptotic distribution is derived under the null hypothesis. A bootstrap version of the test is discussed and its performance is evaluated through Monte Carlo simulations.
Noninformative priors for the ratio of parameters of two Maxwell distributions
Kang, Sang Gil ; Kim, Dal Ho ; Lee, Woo Dong ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 643~650
DOI : 10.7465/jkdi.2013.24.3.643
We develop noninformative priors for the ratio of the parameters of two Maxwell distributions, which is used to check the equality of two Maxwell distributions. Specifically, we focus on developing probability matching priors and Jeffreys' prior for objective Bayesian inference. The probability matching priors, under which the probability of the Bayesian credible interval matches the frequentist probability asymptotically, are developed. The posterior propriety under the developed priors is shown. Some simulations are performed to identify the usefulness of the proposed priors in objective Bayesian inference.
Quadratic inference functions in marginal models for longitudinal data with time-varying stochastic covariates
Cho, Gyo-Young ; Dashnyam, Oyunchimeg ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 651~658
DOI : 10.7465/jkdi.2013.24.3.651
For the marginal model and the generalized estimating equations (GEE) method, there is an important full covariates conditional mean (FCCM) assumption, pointed out by Pepe and Anderson (1994). With longitudinal data with time-varying stochastic covariates, this assumption may not necessarily hold. If the assumption is violated, biased estimates of the regression coefficients may result. But if a diagonal working correlation matrix is used, the resulting estimates are (nearly) unbiased irrespective of whether the assumption is violated (Pan et al., 2000). The quadratic inference functions (QIF) method proposed by Qu et al. (2000) is based on the generalized method of moments (GMM) using GEE. The QIF yields a substantial improvement in efficiency for the estimator of the regression coefficients when the working correlation is misspecified, and equal efficiency to the GEE when the working correlation is correct (Qu et al., 2000). In this paper, we are interested in whether the QIF can improve on the results of the GEE method when the FCCM assumption is violated. We show that the QIF with exchangeable and AR(1) working correlation matrices cannot be consistent and asymptotically normal in this case. It also may not be more efficient than GEE with an independence working correlation. Our simulation studies verify the result.
Usage of auxiliary variable and neural network in doubly robust estimation
Park, Hyeonah ; Park, Wonjun ;
Journal of the Korean Data and Information Science Society, volume 24, issue 3, 2013, Pages 659~667
DOI : 10.7465/jkdi.2013.24.3.659
If either the regression model or the propensity model is correct, the unbiasedness of the estimator using doubly robust imputation can be guaranteed. When a neural network is used instead of a logistic regression model for the propensity model, the estimators using doubly robust imputation are approximately unbiased even when both assumed models fail. We also propose a doubly robust estimator of ratio form that uses population information on an auxiliary variable. We verify some properties of the proposed theory through limited simulations.
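The basic doubly robust idea can be sketched with the standard estimator of a mean under nonresponse, which combines regression predictions with inverse-propensity-weighted residuals. This is the generic textbook form, not the ratio-form estimator or the neural network propensity model proposed in the paper; the predictions `m_hat` and response probabilities `p_hat` are assumed to be supplied by whatever models the user has fitted.

```python
def doubly_robust_mean(y, observed, m_hat, p_hat):
    """Mean estimate: average of m_hat[i] + observed[i] * (y[i] - m_hat[i]) / p_hat[i].

    y        : responses (any placeholder, e.g. None, where unobserved)
    observed : 1 if y[i] was observed, 0 otherwise
    m_hat    : regression-model predictions of y for every unit
    p_hat    : estimated response probabilities for every unit
    """
    total = 0.0
    for yi, di, mi, pi in zip(y, observed, m_hat, p_hat):
        total += mi                   # regression prediction for every unit
        if di:
            total += (yi - mi) / pi   # weighted residual for respondents only
    return total / len(y)

# Hypothetical data: the third response is missing.
y = [1.0, 2.0, None, 4.0]
observed = [1, 1, 0, 1]
m_hat = [1.0, 2.0, 3.0, 4.0]   # predictions that happen to fit perfectly
p_hat = [0.5, 0.5, 0.5, 0.5]   # deliberately crude propensities

print(doubly_robust_mean(y, observed, m_hat, p_hat))
```

When the predictions are right, the residual term vanishes and the propensities do not matter; when the propensities are right, the weighted residuals correct the bias of a wrong regression model. That is the sense in which only one of the two models needs to be correct.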