Journal of the Korean Data and Information Science Society
Korean Data and Information Science Society
Volume 22, Issue 4 - Jul 2011
Classification accuracy measures with minimum error rate for normal mixture
Hong, C.S. ; Lin, Meihua ; Hong, S.W. ; Kim, G.C. ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 619~630
To estimate an appropriate threshold and evaluate its performance for data drawn from a mixture of two distributions, nine well-known classification accuracy measures (MVD, Youden's index, the closest-to-(0,1) criterion, the amended closest-to-(0,1) criterion, SSS, symmetry point, accuracy area, TA, and TR) are grouped into five categories on the basis of their characteristics. In credit evaluation studies, the score random variable is assumed to follow a normal mixture distribution of the default and non-default states. For various normal mixtures, optimal cut-off points for the classification measures belonging to each category are obtained, and the type I and II error rates corresponding to these cut-off points are calculated. We then explore the cases in which these error rates are minimized. If a normal mixture can be estimated for real data of this kind, the results of this study can be used to select the classification accuracy measure with the minimum error rate.
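As a rough illustration of one measure named in this abstract, the sketch below grid-searches the cut-off that maximizes Youden's index, J(c) = sensitivity + specificity - 1, for two normal score distributions. The means, standard deviations, the grid search, the classification direction (low scores flagged as default), and all function names are illustrative assumptions, not values or methods taken from the paper.

```python
import math

def norm_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def youden_cutoff(mu_d, sd_d, mu_n, sd_n, steps=10000):
    """Grid-search the cut-off c that maximizes Youden's index.

    Assumption: scores <= c are classified as default, and defaults
    tend to score lower than non-defaults (mu_d < mu_n).
    """
    lo = min(mu_d - 4.0 * sd_d, mu_n - 4.0 * sd_n)
    hi = max(mu_d + 4.0 * sd_d, mu_n + 4.0 * sd_n)
    best_c, best_j = lo, float("-inf")
    for i in range(steps + 1):
        c = lo + (hi - lo) * i / steps
        sensitivity = norm_cdf(c, mu_d, sd_d)        # defaults flagged
        specificity = 1.0 - norm_cdf(c, mu_n, sd_n)  # non-defaults passed
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_c, best_j = c, j
    return best_c, best_j

def error_rates(c, mu_d, sd_d, mu_n, sd_n):
    """Return (missed-default rate, false-alarm rate) at cut-off c."""
    missed_default = 1.0 - norm_cdf(c, mu_d, sd_d)
    false_alarm = norm_cdf(c, mu_n, sd_n)
    return missed_default, false_alarm

# Hypothetical mixture components: default ~ N(0, 1), non-default ~ N(2, 1).
cutoff, j = youden_cutoff(0.0, 1.0, 2.0, 1.0)
missed, false_alarm = error_rates(cutoff, 0.0, 1.0, 2.0, 1.0)
```

With equal variances the maximizing cut-off falls midway between the two means, which gives a quick sanity check on the search.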
Paradox in collective history-dependent Parrondo games
Lee, Ji-Yeon ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 631~641
We consider a history-dependent Parrondo game in which the winning probability of the present trial depends on the results of the last two trials. When a fraction of an infinite number of players is allowed to choose between two fair Parrondo games at each turn, we compare a blind strategy, such as a random sequence of choices, with the short-range optimization strategy. We show that the random sequence of choices yields a steady increase of average profit. However, if we choose the game that gives the higher expected profit at each turn, surprisingly, we cannot expect a long-run positive profit for some parameter values.
Development of process-oriented education tool for Statistics with Excel Macro
Choi, Hyun-Seok ; Ha, Jeong-Cheol ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 643~650
Recently the need for statistics education has been growing, but a mathematics-oriented education makes college students lose interest in statistics. On the hypothesis that motivating interest is the key factor for learning, we need to develop an education tool for statistics that lets learners study independently but thoroughly. Using Excel macros, we develop and introduce an add-in program, called PETS, which supplies not only results but also the process used to obtain them.
An empirical study on the perception of probability and statistics: With focus on S/W and H/W majors
Lee, Seung-Woo ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 651~660
This study aims to improve teaching and learning in probability/statistics courses in the S/W and H/W fields. To this end, the paper first conducts a survey that measures respondents' perception of the necessity of the related courses and of the contents those courses should cover. Second, it analyzes the educational effect on achievement of studying Pattern Recognition, a major course in S/W and H/W, in combination with probability/statistics or data analysis. Lastly, it suggests a promising pedagogical method for teaching probability/statistics based on the survey and the case studies. In this way, the paper shows the necessity of probability/statistics for acquiring new technology and for a flexible approach to various subjects.
Study on identification of candidate DNA marker related with beef quality in QTL region of BTA 2 in Hanwoo population
Lee, Yoon-Seok ; Oh, Dong-Yep ; Yeo, Jung-Sou ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 661~669
By direct sequencing of 12 STS markers, we identified 10 polymorphic SNPs. Genotype frequency analysis of the 10 polymorphic SNPs in an extreme population (n=20) for marbling score in Hanwoo (n=233) showed a frequency difference of over 40 percent for the HWSNP_1-1 and HWSNP_9-4 SNPs. The HWSNP_1-1 SNP was significantly associated with marbling score in the large-scale population (n=233). We therefore suggest that the HWSNP_1-1 SNP can be useful as a positional candidate for beef quality in marker-assisted selection in Hanwoo.
A study on decision tree creation using intervening variable
Cho, Kwang-Hyun ; Park, Hee-Chang ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 671~678
Data mining searches for interesting relationships among items in a given database. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. The decision tree approach is most useful in classification problems; it divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains such as retail target marketing and customer classification. When a decision tree model is created, the model can become complicated depending on the model creation criteria and the number of input variables. In particular, model creation and analysis are difficult when there are many input variables. In this study, we examine decision tree creation using an intervening variable. We apply the approach to real data to suggest a method that removes unnecessary input variables from the created model, and we examine its efficiency.
Comparison of monitoring the output variable and the input variable in the integrated process control
Lee, Jae-Heon ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 679~690
Two widely used approaches for improving the quality of the output of a process are statistical process control (SPC) and automatic process control (APC). In recent hybrid processes that combine aspects of the process and parts industries, process variations due to both inherent wandering and special causes occur commonly, and thus simultaneous application of APC and SPC schemes is needed to keep such processes close to target. The simultaneous implementation of APC and SPC schemes is called integrated process control (IPC). In the IPC procedure, the output variables are monitored while the controller repeatedly adjusts the process. For monitoring the APC-controlled process, control charts can generally be applied to the output variable. However, as an alternative, some authors have suggested that monitoring the input variable may improve the chance of detection. In this paper, we evaluate the performance of several monitoring statistics, such as the output variable, the input variable, and the difference variable, for efficiently monitoring the APC-controlled process when we assume an IMA(1,1) noise model with a minimum mean squared error adjustment policy.
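To make the setting in this abstract concrete, the following is a minimal simulation sketch of an IMA(1,1) disturbance, z_t = z_{t-1} + a_t - theta * a_{t-1}, under a minimum mean squared error adjustment, assuming a unit process gain and an adjustment that takes full effect at the next period. The function name, the parameter values, and the optional step shift used to mimic a special cause are hypothetical choices for illustration, not the authors' design.

```python
import random

def simulate_ipc(theta, n, seed=0, shift=0.0, shift_at=None):
    """Simulate an APC-controlled process with IMA(1,1) noise.

    Returns the innovations a_t, the output deviations y_t, and the
    per-period adjustments (the "input variable"). Assumes unit gain,
    so the MMSE rule is: adjustment_t = -(1 - theta) * y_t.
    """
    rng = random.Random(seed)
    lam = 1.0 - theta          # EWMA weight implied by the IMA(1,1) model
    z = 0.0                    # disturbance level
    a_prev = 0.0               # previous innovation
    x = 0.0                    # cumulative adjustment applied so far
    innovations, outputs, adjustments = [], [], []
    for t in range(1, n + 1):
        a = rng.gauss(0.0, 1.0)
        z = z + a - theta * a_prev
        special = shift if (shift_at is not None and t >= shift_at) else 0.0
        y = z + x + special    # observed output deviation from target
        dx = -lam * y          # MMSE adjustment made after observing y
        x += dx
        innovations.append(a)
        outputs.append(y)
        adjustments.append(dx)
        a_prev = a
    return innovations, outputs, adjustments

# In control (no special cause), the controlled output reduces to the
# white-noise innovations, and each adjustment is a scaled copy of it.
a, y, dx = simulate_ipc(theta=0.5, n=200)
```

Passing `shift` and `shift_at` adds a sustained step to the output, which is one simple way to compare how quickly charts on `y` and on `dx` would react.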
Development of model for prediction of land sliding at steep slopes
Park, Ki-Byung ; Joo, Yong-Sung ; Park, Dug-Keun ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 691~699
Land sliding is a well-known natural disaster. As part of the effort to reduce damage from land sliding, many researchers have worked on improving predictive ability. However, because previous studies were conducted mostly by non-statisticians, the previously proposed models were hardly statistically justifiable. In this paper, we predict the probability of land sliding using the logistic regression model. Since most explanatory variables under consideration were correlated, we propose the final model after a backward elimination process.
A study on preferable contents depending on regions and terminal types for high speed mobile internet
Ryu, Gui-Yeol ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 701~715
The objective of this study is to determine which kinds of content are preferred, depending on region and terminal type, for high-speed mobile internet. We consider 10 contents, 9 regions, and 2 terminal types. The methods are adjusted residuals, correspondence analysis, and multiple correspondence analysis. The results differ across methods because the underlying mathematical models differ: 50% of the results agree between correspondence analysis and multiple correspondence analysis, 26.3% agree between adjusted residuals and correspondence analysis, and 21.1% agree between adjusted residuals and multiple correspondence analysis. We recommend adjusted residuals because they allow hypothesis tests for content preference. The content not chosen by any of the three methods is business.
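For readers unfamiliar with adjusted residuals, the sketch below computes them for a two-way contingency table using the standard formula (O - E) / sqrt(E (1 - r_i/n)(1 - c_j/n)), where r_i and c_j are the row and column totals. Under independence each value is approximately standard normal, so a magnitude above about 1.96 flags a cell at the 5% level. The 2x2 counts are made up for illustration and are not data from the paper.

```python
import math

def adjusted_residuals(table):
    """Adjusted (standardized) residuals for a two-way table of counts."""
    n = float(sum(sum(row) for row in table))
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    result = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            # Standard error accounts for both row and column margins.
            se = math.sqrt(
                expected * (1.0 - row_tot[i] / n) * (1.0 - col_tot[j] / n)
            )
            out.append((observed - expected) / se)
        result.append(out)
    return result

# Hypothetical counts: rows = two terminal types, columns = two contents.
counts = [[10, 20],
          [20, 10]]
res = adjusted_residuals(counts)
```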
Meta-regression analysis for anti-diabetic effect of green tea
Yun, A-Reum ; Choi, Ki-Heon ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 717~726
The present study was carried out to summarize the effect of green tea in diabetic rats by a meta-analysis of related studies. The association measure used to test the effect of green tea was Hedges' standardized mean difference. In the fixed effect model, body weight was significantly increased, and blood glucose and triglycerides were significantly decreased. For heterogeneous variables, a random effect model was applied. In this model, body weight was significantly increased and blood glucose was significantly decreased in the green tea treated group. According to the meta-regression analysis, duration of injection was not significant for the variables.
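The effect size named in this abstract can be sketched as follows: Hedges' g is Cohen's d multiplied by a small-sample correction J = 1 - 3 / (4 df - 1), and a fixed effect summary is the inverse-variance weighted mean of the per-study values. The variance expression is the usual large-sample approximation, and the study numbers are invented for illustration; none of this is taken from the paper's data.

```python
import math

def hedges_g(m1, s1, n1, m2, s2, n2):
    """Return (g, var_g) for two independent groups."""
    df = n1 + n2 - 2
    s_pooled = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    d = (m1 - m2) / s_pooled
    j = 1.0 - 3.0 / (4.0 * df - 1.0)   # small-sample bias correction
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))
    return j * d, j ** 2 * var_d

def fixed_effect(effects):
    """Inverse-variance weighted mean of (g, var_g) pairs."""
    weights = [1.0 / v for _, v in effects]
    pooled = sum(w * g for w, (g, _) in zip(weights, effects)) / sum(weights)
    return pooled, 1.0 / sum(weights)

# Two hypothetical studies: (mean, sd, n) for treated and control groups.
study_a = hedges_g(5.0, 1.0, 10, 4.0, 1.0, 10)
study_b = hedges_g(5.0, 1.0, 10, 4.0, 1.0, 10)
pooled_g, pooled_var = fixed_effect([study_a, study_b])
```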
A Random Matrix Theory approach to correlation matrix in Korea Stock Market
Kim, Geon-Woo ; Lee, Sung-Chul ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 727~733
To understand the stock market structure, it is very important to extract meaningful information by analyzing the correlation matrix between stock returns. Recently there have been many studies on the correlation matrix using Random Matrix Theory. In this paper we apply this random matrix methodology to a single-factor model and obtain meaningful information on the correlation matrix. In particular, we observe that the analysis of the correlation matrix using the single-factor model explains the real market data, and as a result we confirm the usefulness of the single-factor model.
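A common first step in Random Matrix Theory studies of return correlations is to compare the observed eigenvalues with the Marchenko-Pastur bounds for a purely random correlation matrix, lambda_± = (1 ± sqrt(N/T))^2 for N series of length T with unit variance. The sketch below computes these bounds and flags eigenvalues above the upper one. The sizes and the eigenvalue list are hypothetical, and the abstract does not state that the authors proceed exactly this way.

```python
import math

def marchenko_pastur_bounds(n_series, n_obs):
    """Eigenvalue bounds for a random correlation matrix (unit variance)."""
    if n_obs <= n_series:
        raise ValueError("need more observations than series (T > N)")
    ratio = math.sqrt(n_series / n_obs)
    return (1.0 - ratio) ** 2, (1.0 + ratio) ** 2

def informative_eigenvalues(eigenvalues, n_series, n_obs):
    """Eigenvalues above the upper bound, i.e. not explained by noise."""
    _, upper = marchenko_pastur_bounds(n_series, n_obs)
    return [ev for ev in eigenvalues if ev > upper]

# Hypothetical example: 100 stocks observed over 400 periods.
lower, upper = marchenko_pastur_bounds(100, 400)
signal = informative_eigenvalues([0.3, 1.1, 2.0, 5.7, 31.4], 100, 400)
```

In a market dominated by one common factor, a single very large eigenvalue far above the upper bound is the typical signature.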
Statistical analysis for small power module
Shin, Jae-Kyoung ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 735~740
Recently, electronic devices have been developed with a focus on ultra-compact size, intelligence, multifunction, and broadband capability. Their SMPS is designed for ultra-compact size, light weight, high efficiency, high reliability, and low noise. The power module can be used to supply DC output from a commercial power supply (85 to 265 VAC). A switching power supply can be made easily by simply adding external circuits, such as a microcontroller or a relay. It can be applied to most electronic devices and fits the global "Saving energy" project. However, statistical analysis is needed of the product's quality and performance with respect to load and output voltage.
Comparison of clustering with yeast microarray gene expression data
Lee, Kyung-A ; Kim, Jae-Hee ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 741~753
We perform clustering analyses of yeast cell cycle microarray expression data. We compare model-based clustering, K-means, PAM, SOM, and the hierarchical Ward method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared.
Predicting the future number of failures based on the field failure summary data
Baik, Jai-Wook ; Jo, Jin-Nam ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 755~764
In many companies, field failure data is used to predict the future number of failures, especially when an unexpected failure mode becomes a problem. This is because companies want to predict the number of spare parts needed and the future quality warranty cost associated with the part, based on predictions of the future number of failures. In this paper, field summary data is used to predict the future number of failures based on an appropriate distribution. Other types of data are also investigated to identify the appropriate distribution.
A study measuring university educational service quality using importance-satisfaction transformed index
Choi, Kyoung-Ho ; Kang, Sung ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 765~773
Today, as the number of applicants for admission decreases, competition among universities is deepening in Korea. In particular, local universities, for their very existence, compete intensely to increase the enrollment rate of new students and reduce the dropout rate. To survive this competition, local universities are making various efforts; however, the primary problem is improving their educational service quality. In this study, we develop an instrument for measuring educational service quality that can be applied to higher education, and we derive the factors that determine educational service quality with this instrument. In addition, this research identifies which statistically significant factors play a part in overall satisfaction and the word-of-mouth effect, and interprets 29 quality attributes using the importance-satisfaction transformed index.
Preventive maintenance model following the expiration of NFRRW
Jung, Ki-Mun ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 775~784
In this paper, we consider the periodic preventive maintenance model for a repairable system following the expiration of a non-renewing free replacement-repair warranty (NFRRW). Under this preventive maintenance model, we derive expressions for the expected cycle length, the expected total cost, and the expected cost rate per unit time. We also determine the optimal preventive maintenance period and the optimal preventive maintenance number by minimizing the expected cost rate per unit time. Finally, the optimal periodic preventive maintenance policy is given for the Weibull distribution case.
Effect of threats to anonymity on data reliability in internet survey
Heo, Sun-Yeong ; Chang, Duk-Joon ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 785~794
The population of internet users is rapidly increasing, and interest in internet surveys is increasing as well. Recent years have seen a transition from traditional modes of data collection to internet surveys. Some surveys are administered with mixed modes of traditional data collection methods and internet survey, and some are conducted through the internet only, instead of traditional modes such as telephone surveys, postal surveys, and face-to-face interviews. However, one of the most crucial parts of a survey is the reliability of the collected data, and internet surveys are no exception. Changwon National University has annually conducted a survey of new students and transfer students with almost the same questionnaire contents. The survey is longitudinal and was administered by paper and pencil until 2009. In 2010 the survey was administered through the internet. Every student had to log in with a student ID number and the last 7 digits of the national identity registration number, and complete the 2010 survey before registering for courses. If they left any question unanswered, they could not move on to the course registration site. This study explores the distortion of responses that can occur when survey responses are not confidential, using the new students survey of Changwon National University. We find that distortion of responses occurs in questions with social desirability pressure, pressure to win favor with the researcher, and pressure to explain their situations. There is no distortion of responses in questions that describe simple opinions or simple facts, for example, the place they plan to live while in school.
Variable selection in the kernel Cox regression
Shim, Joo-Yong ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 795~801
In machine learning and statistics it is often the case that some variables are not important, while some variables are more important than others. We propose a novel algorithm for selecting such relevant variables in the kernel Cox regression. We employ a weighted version of ANOVA decomposition kernels to choose an optimal subset of relevant variables in the kernel Cox regression. Experimental results are then presented that indicate the performance of the proposed method.
Control charts for monitoring correlation coefficients in variance-covariance matrix
Chang, Duk-Joon ; Heo, Sun-Yeong ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 803~809
Properties of multivariate Shewhart and CUSUM charts for monitoring the variance-covariance matrix, with a special focus on the correlation coefficient components, are investigated. The performances of the proposed charts, based on the Lawley-Hotelling control statistic and the likelihood ratio test (LRT) statistic, are evaluated in terms of average run length (ARL). For monitoring the correlation coefficient components of the dispersion matrix, we found that the CUSUM chart based on one of these statistics gives relatively better performance and is preferable, while the charts based on the other perform badly and are not recommended.
Estimating reliability in discrete distributions
Moon, Yeung-Gil ; Lee, Chang-Soo ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 811~817
We introduce a general probability mass function that includes several discrete probability mass functions. In particular, when the random variable X is Poisson, binomial, or negative binomial, as special cases of the introduced distribution, the maximum likelihood estimator (MLE) and the uniformly minimum variance unbiased estimator (UMVUE) of the probability P(X […] t) are considered. The efficiencies of the MLE and the UMVUE of the reliability are compared with each other.
Reference priors for nonregular Pareto distribution
Kang, Sang-Gil ; Kim, Dal-Ho ; Lee, Woo-Dong ;
Journal of the Korean Data and Information Science Society, volume 22, issue 4, 2011, Pages 819~826
In this paper, we develop the reference priors for the scale and shape parameters in the nonregular Pareto distribution. We derive the reference priors as noninformative priors and prove the propriety of the joint posterior distribution under general priors, including the reference priors, in the order of inferential importance. Through a simulation study, we compare the reference priors with respect to coverage probabilities of the parameter of interest in a frequentist sense.