Journal of the Korean Data and Information Science Society
Korean Data and Information Science Society
Standard criterion of hypervolume under the ROC manifold
Hong, C.S. ; Jung, D.G. ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 473~483
DOI : 10.7465/jkdi.2014.25.3.473
Although the ROC manifold, which extends the ROC curve and the ROC surface to spaces of more than three dimensions, is difficult to represent graphically, the hypervolume under the ROC manifold (HUM) statistic can be defined and obtained on the basis of the AUC and VUS measures for the ROC curve and the ROC surface. This work therefore studies the definition and characteristics of the HUM in four-dimensional space. By extending the standard criterion of AUC for probabilities of default based on Basel II, 13 classes of a standard criterion of HUM are proposed in order to discriminate four classification models, and some application methods are discussed. A ternary plot is used and explained in order to explore the standard criterion of HUM, whose values are obtained from various distributions.
Pitching grade index in Korean pro-baseball
Lee, Jang Taek ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 485~492
DOI : 10.7465/jkdi.2014.25.3.485
In baseball, the traditional measures of pitchers are wins and ERA, but these statistics are influenced by luck and team strength. Sabermetricians have therefore proposed a number of indicators that predict future performance. We determine a new measure, which we call the pitching grade index (PGI), that efficiently summarizes a pitcher's performance on a numerical scale using principal components analysis. The PGI statistic can often be useful for assessing a pitcher's individual contribution. In addition, the K-means clustering algorithm is used to segment players into groups.
Estimation of exponent value for Pythagorean method in Korean pro-baseball
Lee, Jang Taek ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 493~499
DOI : 10.7465/jkdi.2014.25.3.493
The Pythagorean won-loss formula postulated by James (1980) gives the percentage of games won as a function of runs scored and runs allowed. Several hundred articles have explored variations that improve on the RMSE of the original formula and on its fit to empirical data. This paper considers a variation of the formula that allows the Pythagorean exponent to vary. We provide the most suitable exponent for the Pythagorean method and compare it with other methods, such as the Pythagenport by Davenport and Woolner and the Pythagenpat by Smyth and Patriot. Finally, our results suggest that the proposed method is superior to other tractable alternatives under the RMSE criterion.
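As an illustration of the idea in this abstract, the following minimal sketch computes the Pythagorean expectation RS^x / (RS^x + RA^x) and picks the exponent x with the smallest RMSE by grid search. The grid search, the function names, and the sample seasons are assumptions made for illustration; they are not the paper's estimation procedure or data.

```python
import math


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def rmse(exponent, seasons):
    """RMSE between predicted and observed winning percentages.

    `seasons` is a list of (runs_scored, runs_allowed, observed_win_pct).
    """
    squared_errors = [
        (pythagorean_win_pct(rs, ra, exponent) - wp) ** 2
        for rs, ra, wp in seasons
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_exponent(seasons, low=1.0, high=3.0, step=0.01):
    """Grid search (an assumed, simple optimizer) for the lowest-RMSE exponent."""
    n_steps = int(round((high - low) / step))
    candidates = [low + i * step for i in range(n_steps + 1)]
    return min(candidates, key=lambda x: rmse(x, seasons))


# Made-up team seasons for illustration only (not Korean pro-baseball data).
seasons = [
    (650, 600, 0.540),
    (580, 640, 0.455),
    (700, 610, 0.565),
    (600, 605, 0.495),
]
best = fit_exponent(seasons)
print(f"best exponent: {best:.2f}, RMSE: {rmse(best, seasons):.4f}")
```

With real league data, the same `rmse` function could be used to compare a fitted exponent against the classical value of 2.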
A study on transition of programming academic achievement for H/W majors
Lee, Seung-Woo ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 501~512
DOI : 10.7465/jkdi.2014.25.3.501
The purpose of this study is to improve the academic achievement of H/W majors. First, this paper proposes an educational case study that develops learners' ability, increases interest in programming, a field that H/W majors tend to find unfavorable, and aims to raise the employment rate in programming. Second, this paper presents a future teaching method for programming aimed at improving the employment rate, based on the department's special characteristics and the actual circumstances of the H/W field. Lastly, this paper suggests a promising pedagogical method for teaching programming, drawing on a survey and the case studies.
An estimation of implied volatility for KOSPI200 option
Choi, Jieun ; Lee, Jang Taek ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 513~522
DOI : 10.7465/jkdi.2014.25.3.513
Assuming that the price of a stock follows a geometric Brownian motion with constant volatility, Black and Scholes (BS) derived a formula that gives the price of a European call option on the stock as a function of the stock price, the strike price, the time to maturity, the risk-free interest rate, the dividend rate paid by the stock, and the volatility of the stock's return. In practice, however, the implied volatilities of the BS method tend to depend on the stock prices and the time to maturity. To address this shortcoming, we estimate the implied volatility function as a function of the strike price and the time to maturity, using daily prices of KOSPI200 call options from January 2007 to May 2009 and three methods: support vector regression (SVR), the multiple additive regression trees (MART) algorithm, and ordinary least squares (OLS) regression. In conclusion, using MART or SVR in the BS pricing model reduced both RMSE and MAE compared with the OLS-based BS pricing model.
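To make the notion of implied volatility concrete, the sketch below prices a European call with the standard Black-Scholes formula (with a continuous dividend yield) and inverts it for volatility by bisection. The function names, the bisection bounds, and the numerical inputs are assumptions for illustration; the abstract does not describe how the authors computed implied volatilities, and the SVR, MART, and OLS fits are not reproduced here.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(spot, strike, maturity, rate, dividend, sigma):
    """Black-Scholes price of a European call with continuous dividend yield."""
    vol_sqrt_t = sigma * math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate - dividend + 0.5 * sigma ** 2) * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return (spot * math.exp(-dividend * maturity) * norm_cdf(d1)
            - strike * math.exp(-rate * maturity) * norm_cdf(d2))


def implied_volatility(price, spot, strike, maturity, rate, dividend,
                       low=1e-6, high=5.0, tol=1e-8):
    """Invert the call price for sigma by bisection.

    Relies on the call price being increasing in sigma. If `price` lies
    outside the range attainable on [low, high], the result is simply
    one end of that interval.
    """
    for _ in range(200):
        mid = 0.5 * (low + high)
        if bs_call_price(spot, strike, maturity, rate, dividend, mid) > price:
            high = mid
        else:
            low = mid
        if high - low < tol:
            break
    return 0.5 * (low + high)


# Hypothetical inputs, not KOSPI200 data.
price = bs_call_price(100.0, 100.0, 1.0, 0.05, 0.0, 0.2)
sigma = implied_volatility(price, 100.0, 100.0, 1.0, 0.05, 0.0)
print(f"call price: {price:.4f}, recovered volatility: {sigma:.4f}")
```

Implied volatilities obtained this way for many strikes and maturities would form the response variable for a regression on strike price and time to maturity.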
A study on design effect models for complex sample survey
Park, Inho ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 523~531
DOI : 10.7465/jkdi.2014.25.3.523
The design effect is often used in designing and planning sample surveys and in evaluating the efficiency of complex design features of the surveys. In this study, we applied the design effect model of Gabler et al. (2006) to the 2013 Consumer behavior survey for food, which was carried out by stratified two-stage sampling. The usability and adequacy of the design effect model for real survey data are discussed and evaluated.
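For readers unfamiliar with design effects, the sketch below computes two textbook components usually attributed to Kish: the clustering effect 1 + (b - 1) * rho and the unequal-weighting effect n * sum(w^2) / (sum(w))^2. It is only a rough illustration and does not reproduce the Gabler et al. (2006) model applied in the paper; the function names, the product approximation, and the inputs are assumptions.

```python
def deff_clustering(mean_cluster_size, intraclass_corr):
    """Kish's clustering design effect: 1 + (b - 1) * rho."""
    return 1.0 + (mean_cluster_size - 1.0) * intraclass_corr


def deff_weighting(weights):
    """Kish's unequal-weighting design effect: n * sum(w^2) / (sum w)^2."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


# Hypothetical inputs for illustration only.
weights = [1.0, 1.0, 2.0, 2.0, 4.0]
deff_c = deff_clustering(mean_cluster_size=10, intraclass_corr=0.05)
deff_w = deff_weighting(weights)

# Multiplying the two components is a common rough approximation
# of the overall design effect, used here as an assumption.
print(f"clustering: {deff_c:.3f}, weighting: {deff_w:.3f}, "
      f"overall (approx.): {deff_c * deff_w:.3f}")
```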
A survey study of farmers' recognition on reality of Hanwoo raising and improving quality : Focused on Gyeongsangbuk-Do
Kim, Byung-Ki ; Oh, Dong-Yep ; Jung, Dae-Jin ; Lee, Jea-Young ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 533~545
DOI : 10.7465/jkdi.2014.25.3.533
Farmers' perceptions of actual raising conditions and breeding improvement for Hanwoo were surveyed and analyzed so that the data can serve as a basic resource for further developing Hanwoo improvement courses and instruction on raising techniques. The survey covered Hanwoo farmers in the Gyeongbuk region, and the results of the analysis were as follows. Candidate cattle for breeding were selected in consideration of 'appearance, body shape, and pedigree-registration' (39.0%), and 'artificial insemination' (38.6%) was the most frequently used breeding method for breeding cattle. 'Body length' was the factor most often considered when purchasing fattening calves, and castration of fattening calves was mostly performed at '6~7 months after the birth'. The farmers also responded that they 'try to comply with over 80% of items specified in program for production of high quality beef' in order to produce high quality beef. However, the farmers believed that '12 months after the birth' was the most economic market month. Although the results differed by survey item, the majority showed statistically significant differences at the 0.05 significance level across the respondents' general characteristics and demographic factors, including level of education, age, occupation, and family manpower. For the prevalence of beef graded better than 1++ grade at shipping, the most common response was 'around 30% of shipping heads' (22.1%); however, no significant differences between the general characteristics of respondents were observed.
Projection analysis for two-way variance components
Choi, Jaesung ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 547~554
DOI : 10.7465/jkdi.2014.25.3.547
This paper discusses a method of estimating variance components for a random effects model. Henderson's methods I and III are discussed for the estimation of variance components. This paper shows how to use projections instead of Henderson's methods to calculate sums of squares, which are quadratic forms in the observations. It also shows that eigenvalues can be used to obtain the expectations of sums of squares in place of Hartley's method of synthesis. The suggested method is shown to be much more effective than those methods.
Analysis of sports injuries among Korean national players during official training
Kim, Eun Kuk ; Kim, Tae Gyu ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 555~565
DOI : 10.7465/jkdi.2014.25.3.555
The purpose of this study was to analyze sports injuries that occurred in the Korean national team during the official training period. All sports injuries were recorded on an injury report form by physicians, medical staff, and athletic trainers, and only acute and recurrent injuries were analyzed. A total of 3,421 injuries were reported, of which 1,560 were new injuries and 1,861 were recurrences of previous injuries. The frequency of new injuries in male and female athletes was highest in boxing (n
Study on the validity of PEAS for analyzing doping attitude and disposition of Korean elite player through Rasch model
Kim, Tae Gyu ; Kim, Sae Hyung ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 567~578
DOI : 10.7465/jkdi.2014.25.3.567
The PEAS (performance enhancement attitude scale) has been used to measure attitude and disposition toward doping in elite athletes. It consists of 17 items on a 6-point scale. The purpose of this study was to verify the validity of the PEAS for Korean elite players through the Rasch model. The scale was administered to 438 Korean elite players. Principal component analysis was used to verify unidimensionality using the SPSS program. The Rasch measurement computer program WINSTEPS was used to estimate the goodness-of-fit of items and the category structure. Differential item functioning by gender was also estimated with the WINSTEPS program. The alpha level was set at 0.05 for all tests. First, principal component analysis showed that unidimensionality was satisfied, with over 20.0% of the variance explained. Second, the category probability curves showed that a 5-point scale was statistically better than the 6-point scale. Third, seven of the 17 items (1, 9, 10, 12, 13, 14, 17) did not fit the model well, and three items (3, 12, 13) showed differential item functioning. This study showed that a 9-item, 5-point scale is a better PEAS for Korean elite players.
A systematic review of studies using time series analysis of health and welfare in Korea
Woo, Kyung-Sook ; Shin, Young-Jeon ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 579~599
DOI : 10.7465/jkdi.2014.25.3.579
The purpose of this study was to identify the trends and risk of bias of research using time series analysis on health and welfare in Korea and to suggest a direction for future health and welfare research. The database searches identified 6,543 papers. After screening and selection, a total of 91 papers were included in the systematic review. The number of articles using time series analysis increased steadily from 1987 to 2013. Time series analysis was applied in medicine and health science journals. The main goals were explanation and description. Most of the subjects were health status and utilization of healthcare services. The main model used in the time series analysis was ARIMA, followed by time series regression. The data were gathered from various sources, including the national statistical office and government agencies. In the risk of bias assessment, some studies were found to have inadequate sample sizes or to show no time series graphs or plots. These findings suggest that time series analysis should be used more widely in the field of health and welfare, with appropriate analysis methods and statistical procedures, to obtain more reliable results and improve the quality of research.
The development of symmetrically and attributably pure confidence in association rule mining
Park, Hee Chang ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 601~609
DOI : 10.7465/jkdi.2014.25.3.601
The most widely used data mining technique for big data analysis is the generation of meaningful association rules. This method has been used to find relationships between sets of items based on association criteria such as support, confidence, and lift. Among them, confidence is the most frequently used, but it has the drawback that it does not reveal the direction of association. The attributably pure confidence was developed to compensate for this drawback, but its value changes with the position of the two item sets. In this paper, we propose four symmetrically and attributably pure confidence measures to compensate for the shortcomings of confidence and the attributably pure confidence. We then prove the three conditions for an interestingness measure given by Piatetsky-Shapiro, and compare confidence, attributably pure confidence, and the four symmetrically and attributably pure confidence measures using numerical examples. The results show that the symmetrically and attributably pure confidence measures are better than confidence and the attributably pure confidence. The measure NSAP is also found to be the best among these four symmetrically and attributably pure confidence measures.
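The standard criteria named in this abstract (support, confidence, lift) can be sketched as follows. The example also shows that confidence depends on which item set is the antecedent. The attributably pure confidence and the proposed symmetric measures are not implemented, since their formulas are not given in the abstract; the function names and the basket data are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent).

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by the support of the consequent."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Made-up baskets for illustration only.
baskets = [
    {"milk", "bread"},
    {"milk", "bread"},
    {"milk"},
    {"milk"},
    {"eggs"},
]
print("conf(milk -> bread):", confidence(baskets, {"milk"}, {"bread"}))
print("conf(bread -> milk):", confidence(baskets, {"bread"}, {"milk"}))
print("lift(milk, bread):  ", lift(baskets, {"milk"}, {"bread"}))
```

Lift is symmetric in the two item sets, whereas confidence is not, which is one reason several complementary measures are usually reported together.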
Nonparametric Bayesian estimation on the exponentiated inverse Weibull distribution with record values
Seo, Jung In ; Kim, Yongku ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 611~622
DOI : 10.7465/jkdi.2014.25.3.611
The inverse Weibull distribution (IWD) is the complementary Weibull distribution and plays an important role in many application areas. In Bayesian analysis, Soland's method can be considered to avoid computational complexities. One limitation of this approach is that the parameters of interest are restricted to a finite number of values. This paper introduces a nonparametric Bayesian estimator in the context of record values from the exponentiated inverse Weibull distribution (EIWD). Instead of Soland's conjugate prior, a stick-breaking prior is considered, and the corresponding Bayesian estimators under the squared error loss function (quadratic loss) and the LINEX loss function are obtained and compared with other estimators. The results may be of interest especially when only record values are stored.
Reliability estimation and ratio distribution in a general exponential distribution
Lee, Chang-Soo ; Moon, Yeung-Gil ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 623~632
DOI : 10.7465/jkdi.2014.25.3.623
We shall consider the estimation for the parameter and the right tail probability in a general exponential distribution. We also shall consider the estimation of the reliability P(X < Y ) and the skewness trends of the density function of the ratio X
Estimation of the exponential distribution based on multiply Type I hybrid censored sample
Lee, Kyeongjun ; Sun, Hokeun ; Cho, Youngseuk ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 633~641
DOI : 10.7465/jkdi.2014.25.3.633
The exponential distribution is one of the most popular distributions for analyzing lifetime data. In this paper, we propose multiply Type I hybrid censoring and present statistical inference on the scale parameter of the exponential distribution when samples are multiply Type I hybrid censored. The scale parameter is estimated by approximate maximum likelihood estimation methods using two different types of Taylor series expansion. We also obtain the maximum likelihood estimator (MLE) of the scale parameter under the proposed multiply Type I hybrid censored samples. We compare the estimators in the sense of the root mean square error (RMSE). The simulation procedure is repeated 10,000 times for the sample size n
Estimation for the extreme value distribution under progressive Type-I interval censoring
Nam, Sol-Ji ; Kang, Suk-Bok ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 643~653
DOI : 10.7465/jkdi.2014.25.3.643
In this paper, we propose some estimators for the extreme value distribution based on the interval method and the mid-point approximation method from a progressive Type-I interval censored sample. Because the log-likelihood function is nonlinear, we use a Taylor series expansion to derive approximate likelihood equations. We compare the proposed estimators in terms of the mean squared error using Monte Carlo simulation.
Depression and suicidal ideation in community-dwelling older adults in Korea
Kwon, So-Hi ; Sohn, Myungji ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 655~663
DOI : 10.7465/jkdi.2014.25.3.655
This study aimed to investigate the prevalence of depression and suicidal ideation in community-dwelling older adults in Korea, as well as identify factors associated with their occurrence, including cognitive impairment. A cross-sectional study of 484 residents was conducted at a senior centre utilising the PHQ-9K and K-MMSE. Demographic data were also collected for analysis. Of the respondents, 38.1% had symptoms of mild to severe depression. Further, 16.7% reported having suicidal ideation, with 5% of respondents having thoughts of suicide every day. The majority of participants had 'normal' scores on the K-MMSE (88.0%), though significant differences were observed in PHQ-9K scores between cognitive-acceptable and cognitive-impaired groups. Depressive symptoms and suicidal ideation were very prevalent in community-dwelling older adults in Korea. This study indicates the need for the development of community-based mental health programs tailored to older adults, and demonstrates the viability of promoting early detection of depressive symptoms through senior centres.
Social media comparative analysis based on multidimensional scaling
Lee, Hanjun ; Suh, Yongmoo ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 665~676
DOI : 10.7465/jkdi.2014.25.3.665
As social media draws attention as a business tool, organizations, large or small, are trying to exploit social media in their business. However, a lack of understanding of the characteristics of each social media platform has led them to develop naive strategies for dealing with social media. This study therefore aims to deepen that understanding by comparatively analyzing how social media users perceive (the image of) each platform. Facebook, Twitter, YouTube, Blogs, Communities and Cyworld were chosen for our study, and data from 132 respondents were analyzed using the multidimensional scaling technique. The results show that there are meaningful differences in users' perceptions of social media attributes, which are grouped into four categories: information feature, motivation, promotion tool, and usability. We also analyze whether such differences can be found between male and female users. Further, we discuss some implications of the research results for both practitioners and researchers.
Support vector quantile regression ensemble with bagging
Shim, Jooyong ; Hwang, Changha ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 677~684
DOI : 10.7465/jkdi.2014.25.3.677
Support vector quantile regression (SVQR) is capable of providing a more complete description of the linear and nonlinear relationships among random variables. To improve the estimation performance of SVQR, we propose using an SVQR ensemble with bagging (bootstrap aggregating), in which SVQRs are trained independently on training data sets sampled randomly via a bootstrap method. They are then aggregated to obtain the estimator of the quantile regression function using a penalized objective function composed of check functions. Experimental results are presented to illustrate the performance of the SVQR ensemble with bagging.
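The two ingredients named in this abstract, the check function and bootstrap aggregating, can be illustrated with a deliberately simplified sketch. Here a plain sample quantile stands in for the SVQR base learner, and the bootstrap estimates are combined by a simple average rather than by the penalized check-function objective the authors describe. All names and data are assumptions for illustration.

```python
import random


def check_loss(residual, tau):
    """Check (pinball) function: rho_tau(r) = r * (tau - 1[r < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def sample_quantile(values, tau):
    """Sample point that minimizes the total check loss over the sample."""
    return min(values,
               key=lambda q: sum(check_loss(v - q, tau) for v in values))


def bagged_quantile(values, tau, n_estimators=50, seed=0):
    """Average of quantile estimates fitted on bootstrap resamples.

    Stand-in for an SVQR ensemble: each "learner" is a sample quantile,
    and aggregation is a plain mean (both are simplifying assumptions).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_estimators):
        resample = [rng.choice(values) for _ in values]
        estimates.append(sample_quantile(resample, tau))
    return sum(estimates) / len(estimates)


# Synthetic data for illustration only.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
print("single 0.9-quantile estimate:", sample_quantile(data, 0.9))
print("bagged 0.9-quantile estimate:", bagged_quantile(data, 0.9))
```

Replacing `sample_quantile` with a regression learner fitted by minimizing the same check loss would give the general shape of a bagged quantile regression ensemble.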
Robust Bayesian inference in finite population sampling with auxiliary information under balanced loss function
Kim, Eunyoung ; Kim, Dal Ho ;
Journal of the Korean Data and Information Science Society, volume 25, issue 3, 2014, Pages 685~696
DOI : 10.7465/jkdi.2014.25.3.685
In this paper, we develop Bayesian inference for the finite population mean under the balanced loss function, assuming posterior linearity rather than normality of the superpopulation in the presence of auxiliary information. We compare the performance of the optimal Bayes estimator under the balanced loss function with that of the classical ratio estimator and the usual Bayes estimator in terms of posterior expected losses, risks, and Bayes risks.