Journal of the Korean Data and Information Science Society
Korean Data and Information Science Society
Volume 22, Issue 6 - Dec 2011
Undecided inference using bivariate probit models
Hong, Chong-Sun ; Jung, Mi-Yang ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1017~1028
When it is not easy to decide the credit score for some loan applicants, credit evaluation is postponed and the undecided applicants are referred to a specialist for further evaluation. This undecided inference is a problem that arises in most statistical models, including those in biostatistics and sports statistics as well as credit evaluation. In this work, undecided inference is treated as a missing-data mechanism under the MNAR (missing not at random) assumption, and the bivariate probit model, one of the sample selection models, is used. Two undecided inference methods are proposed: one uses characteristic variables that represent the state of decided applicants, and the other collects additional, more accurate information and applies these new variables. With an illustrative example, misclassification error rates for undecided and overall applicants are obtained and compared across various characteristic variables, undecided intervals, and thresholds. Misclassification error rates are found to decrease when the undecided interval is widened and more accurate information is put into the model, since the bivariate probit model then reflects the situation of decided applicants more accurately.
Online SLAM algorithm for mobile robot
Kim, Byung-Joo ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1029~1040
In this paper we propose an intelligent navigation algorithm for real-world problems that can build a map without localization. The proposed algorithm operates online and does not require much memory when applied to real-world problems. When applied to toy and large data sets, it does not need to compute the whole eigenspace and needs less memory than existing algorithms. We therefore conclude that the proposed algorithm is suitable for real-world mobile navigation.
Contingent valuation method implemented by R: Case study - measuring value of information
Jung, Byung-Joon ; Pak, Ro-Jin ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1041~1051
The development of information technology provides us with more useful information, but it has also created the need to protect such information from inappropriate users. In analyzing and managing the risks associated with information, it is necessary to measure the value of information accurately. We consider the contingent valuation method for this purpose. The contingent valuation method, which is used to assess the value of public goods or nonmarket goods, produces a statistical estimate of the willingness to pay. We show with an example how the value of information held on an information system can be estimated by calculating the amount we are willing to pay for it. The calculation is carried out using R.
Social network analysis for a soccer game
Choi, Seung-Bae ; Kang, Chang-Wan ; Choi, Hyong-Jun ; Kang, Byung-Yuk ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1053~1063
Social network analysis is the statistical analysis of a social structure involving a flow of mutual information between observations. In this study we use the record of passes between players in a soccer game. The analysis covers the following: (1) players with important or leading roles are identified; (2) players are assessed by pass frequency and pass success rate. The purpose of this study is to provide basic data for future team strategy by evaluating the role of each individual player within the team. Social network analysis is conducted for the whole team without separating positions, and also separately for defensive and attacking positions. The results are as follows. For the complete team data, the players performing leadership roles were Jung-woo Kim, Sung-yeung Ki and Chung-young Lee, whereas Jeong-su Lee acted as a sub-leader. For defensive positions, Jeong-su Lee was the leading player; for attacking positions, all of the players excelled in the game and could be evaluated as playing lead roles.
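As a rough illustration of how a pass record can be turned into a per-player importance score, the sketch below computes each player's share of all pass involvements (passes given plus passes received). This weighted-degree share is only one simple centrality measure and is not claimed to be the one used in the paper; the player labels and pass counts are hypothetical.

```python
def pass_involvement_share(passes):
    """Return each player's share of pass involvements.

    passes maps (giver, receiver) -> number of completed passes.
    A player's score is (passes given + passes received) / (2 * total passes),
    so the scores of all players sum to 1.
    """
    given, received = {}, {}
    for (giver, receiver), count in passes.items():
        given[giver] = given.get(giver, 0) + count
        received[receiver] = received.get(receiver, 0) + count
        # Make sure every player appears in both dictionaries.
        given.setdefault(receiver, 0)
        received.setdefault(giver, 0)
    total = sum(passes.values())
    return {p: (given[p] + received[p]) / (2 * total) for p in given}


# Hypothetical pass counts between three players A, B and C.
example_passes = {("A", "B"): 10, ("B", "C"): 5, ("C", "A"): 5}
shares = pass_involvement_share(example_passes)
```

Players with the largest shares are the ones most involved in the flow of passes, which is one way to shortlist candidates for a leading role.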
A numerical study on portfolio VaR forecasting based on conditional copula
Kim, Eun-Young ; Lee, Tae-Wook ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1065~1074
Over several decades, many researchers in the field of finance have studied Value at Risk (VaR) as a measure of market risk. VaR indicates the worst loss over a target horizon such that there is a low, pre-specified probability that the actual loss will be larger (Jorion, 2006, p.106). In this paper, we compare the conditional copula method with two conventional VaR forecasting methods, based on the simple moving average and the exponentially weighted moving average, for measuring the risk of a portfolio consisting of two domestic stock indices. Through real data analysis, we conclude that the conditional copula method can improve the accuracy of portfolio VaR forecasting in the presence of high kurtosis and strong correlation in the data.
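The two conventional baselines named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes zero-mean normal returns, uses a smoothing constant of 0.94 (a common convention, not a value taken from the paper), and the return series is hypothetical. The conditional copula method itself is not shown.

```python
from statistics import NormalDist


def sma_variance(returns, window):
    """Equally weighted variance of the last `window` returns (zero mean assumed)."""
    recent = returns[-window:]
    return sum(r * r for r in recent) / len(recent)


def ewma_variance(returns, lam=0.94):
    """EWMA recursion: var_t = lam * var_{t-1} + (1 - lam) * r_t ** 2."""
    var = returns[0] ** 2  # initialise with the first squared return
    for r in returns[1:]:
        var = lam * var + (1.0 - lam) * r * r
    return var


def normal_var(variance, alpha=0.01):
    """One-step VaR, reported as a positive loss, under a zero-mean normal model."""
    z = NormalDist().inv_cdf(alpha)  # negative for small alpha
    return -z * variance ** 0.5


# Hypothetical daily portfolio returns.
returns = [0.004, -0.012, 0.007, -0.021, 0.015, -0.003, 0.009, -0.030]
var_sma = normal_var(sma_variance(returns, window=5))
var_ewma = normal_var(ewma_variance(returns))
```

The EWMA forecast reacts faster to recent large returns than the moving average, because the weight on each squared return decays geometrically rather than dropping to zero at the window edge.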
A study on tourist satisfaction of the Daegu City Tour using a structural equation model
Song, Mi-Jung ; Lee, Ji-Yeon ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1075~1087
We analyze tourist satisfaction with the Daegu City Tour, which plays a big role in boosting local tourism through '2011 Visit Daegu Year'. To analyze the causal relations among factors, we propose a structural equation model consisting of four latent variables of the tour: motivation, expectation, satisfaction, and future behavior. Using data from actual tourists of the Daegu City Tour, we find that tourists' motivation before the tour does not affect their satisfaction after the tour. However, those with higher motivation show positive future behavior, and those with higher expectations are more satisfied with the tour. Meanwhile, expectation before the tour does not lead to future behavior, but satisfaction after the tour positively influences future behavior.
An optimal policy for an infinite dam with exponential inputs of water
Kim, Myung-Hwa ; Baek, Jee-Seon ; Choi, Seung-Kyoung ; Lee, Eui-Yong ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1089~1096
We consider an infinite dam with inputs formed by a compound Poisson process and adopt a $P^M_\lambda$-policy to control the level of water, where the water is released at rate $M$ when the level of water exceeds threshold $\lambda$. We obtain interesting stationary properties of the level of water when the amount of each input independently follows an exponential distribution. After assigning several managing costs to the dam, we derive the long-run average cost per unit time and show that there exist unique values of releasing rate $M$ and threshold $\lambda$ which minimize the long-run average cost per unit time. Numerical results are also illustrated using MATLAB.
Probability and statistics curriculum in school
Oh, Kwan-Sik ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1097~1103
The Ministry of Education, Science and Technology proclaimed the school curriculum of the Republic of Korea on 9 August 2011. The characteristics of this curriculum are as follows: effective student-centered learning, grade clusters, subject-category clusters, concentrated study, a reduced number of study subjects, extended school autonomy, and an elective curriculum in high school. We investigate the modifications to the probability and statistics curriculum, and we discuss statistics education in universities.
Nonparametric procedures using placement in randomized block design with replications
Lee, Sang-Yi ; Kim, Dong-Jae ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1105~1112
Mack (1981) and Skilling and Wolfe (1977, 1978) proposed typical nonparametric methods for the randomized block design with replications. In this paper, we propose procedures based on placements as an extension of the two-sample placement tests described in Orban and Wolfe (1982) and the treatment versus control tests described in Kim (1999). A Monte Carlo simulation study is also conducted to compare the power of the proposed procedures with that of previous procedures.
Association rule thresholds of similarity measures considering negative co-occurrence frequencies
Park, Hee-Chang ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1113~1121
Recently, a variety of data mining techniques has been applied in fields such as healthcare, insurance, and internet shopping malls. Association rule mining is a popular and well-researched method for discovering interesting relations among large sets of data items. It quantifies the relationship between sets of items in a very large database on the basis of association thresholds. There are three primary quality measures for association rules: support, confidence, and lift. In this paper we consider, as association thresholds, some similarity measures with negative co-occurrence frequencies, which are widely used in cluster analysis and multi-dimensional analysis. Comparative studies of support, confidence, and some similarity measures are shown by a numerical example.
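To make the measures concrete, the sketch below computes support, confidence, and lift for a rule X -> Y from a 2x2 co-occurrence table, together with the simple matching coefficient as one example of a similarity measure that counts negative co-occurrences (transactions containing neither item). The choice of the simple matching coefficient is an illustrative assumption, not necessarily one of the measures studied in the paper, and the counts are hypothetical.

```python
def rule_measures(a, b, c, d):
    """Measures for the rule X -> Y from a 2x2 co-occurrence table.

    a: transactions with both X and Y
    b: transactions with X only
    c: transactions with Y only
    d: transactions with neither (the negative co-occurrence frequency)
    """
    n = a + b + c + d
    support = a / n                      # P(X and Y)
    confidence = a / (a + b)             # P(Y | X)
    lift = confidence / ((a + c) / n)    # P(Y | X) / P(Y)
    simple_matching = (a + d) / n        # agreement on presence and absence
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "simple_matching": simple_matching,
    }


# Hypothetical counts for 100 transactions.
measures = rule_measures(a=30, b=10, c=20, d=40)
```

Unlike support, which ignores d entirely, the simple matching coefficient rises when many transactions contain neither item, which is why the two can rank the same rule very differently.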
A readjustment procedure in the multivariate integrated process control
Cho, Gyo-Young ; Park, Jong-Suk ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1123~1135
This paper considers a multivariate integrated process control procedure for detecting special causes in a multivariate IMA(1, 1) process. When the multivariate control chart signals, the special cause is detected and eliminated from the process. However, when eliminating the special cause is costly or not practically possible, an alternative action is to readjust the process with an approximately modified adjustment scheme. In this paper, we propose a readjustment procedure to be applied after a true signal, and show that readjustment can reduce the deviation of the process from the target.
The preference for direct marketing according to the characteristics of policyholders in the life insurance industry
Jung, Se-Chang ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1137~1143
The purpose of this paper is to analyse the preference for direct marketing according to the characteristics of policyholders and to suggest implications for direct marketing strategies. A distinguishing feature of this paper is the high quality of the data, which makes the results of the analysis highly reliable. Binary logistic regression is employed. A statistically significant preference is shown in groups such as males, the younger generation, those in hazardous occupations, residents of the metropolitan area, and customers of foreign companies. The results suggest that promotion aimed at women is needed to revitalize direct marketing. Tight underwriting for hazardous occupations is also required.
Estimation of genetic parameters using real-time ultrasound measurements in Hanwoo
Lee, Ji-Hong ; Yeo, Jung-Sou ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1145~1152
This study was conducted to estimate genetic effects on economically important traits for genetic improvement in Hanwoo by using real-time ultrasound measurements of longissimus dorsi muscle area (LMA), backfat thickness (BFT), and marbling score (Marb). The phenotypic data were obtained from 1,648 pedigreed cows, and general linear models were applied to test the effects of age, region, and body condition score. The cows between 50 and 60 months of age had the greatest scores for LMA, BFT, and Marb (P<0.05). The cows in region C had the greatest scores for body condition score, LMA and BFT, while Marb was the lowest in region J (P<0.05). LMA, BFT, and Marb were positively related to increasing body condition score. Heritabilities for LMA, BFT, and Marb were estimated as 0.136, 0.351, and 0.236, respectively. These results provide primary information for the efficient implementation of genetic improvement schemes in Hanwoo.
Design and evaluation of a VPRS-based misbehavior detection scheme for VANETs
Kim, Chil-Hwa ; Bae, Ihn-Han ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1153~1166
Detecting misbehavior in vehicular ad-hoc networks is a very important problem with a wide range of implications, including safety-related and congestion-avoidance applications. Most misbehavior detection schemes are concerned with detecting malicious nodes. In most situations, vehicles send wrong information because of the selfish motives of their owners. Given such rational behavior, it is more important to detect false information than to identify misbehaving nodes. In this paper, we propose a misbehavior detection scheme based on variable precision rough sets (VPRS) that detects false alert messages and misbehaving nodes by observing their actions after they send out alert messages. In the proposed scheme, an alert information system (the alert profile) is constructed from valid actions of moving nodes in vehicular ad-hoc networks. Once a moving vehicle receives an alert message from another vehicle, it determines the alert type from the message. When the vehicle later receives a beacon from the vehicle that raised the alert, it computes the relative classification error using variable precision rough sets on the alert information system. If the relative classification error is larger than the maximum allowable relative classification error for the alert type, the vehicle judges the message to be a false alert. The performance of the proposed scheme is evaluated through simulation using two metrics: correct ratio and incorrect ratio.
The proposed algorithm for the student numbers in local government
Kim, Jong-Tae ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1167~1173
The goal of this paper is to suggest an algorithm for forecasting the number of students in each city or county of a local government by using the double exponential smoothing method. By 2044, the number of third-year high school students in Chilgok, Gumi, Gyeongsan, Andong, Pohang and Gimchen is forecast to fall by about 40-70%, and that in the remaining cities and counties by about 70-95%. In conclusion, the forecast numbers of students in the 23 counties of Kyungbuk Province decrease by 40%-100% by 2044 in comparison with the numbers of students in 2010.
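A minimal sketch of double exponential smoothing is given below. It uses Brown's single-parameter form, which is an assumption here: the abstract does not say whether Brown's or Holt's variant was used, and the smoothing constant and enrollment figures are hypothetical.

```python
def brown_forecast(series, alpha, horizon):
    """Forecast `horizon` steps ahead with Brown's double exponential smoothing.

    s1 is the singly smoothed series and s2 smooths s1 again; level and trend
    are recovered from the two, and the forecast extrapolates them linearly.
    """
    s1 = s2 = series[0]  # initialise both smoothers at the first observation
    for y in series[1:]:
        s1 = alpha * y + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
    level = 2 * s1 - s2
    trend = alpha / (1 - alpha) * (s1 - s2)
    return level + trend * horizon


# Hypothetical yearly student counts for one county.
students = [1200, 1150, 1110, 1060, 1000, 960, 900]
next_year = brown_forecast(students, alpha=0.5, horizon=1)
```

For a steadily declining series the trend term is negative, so longer horizons give smaller forecasts; in practice a forecast below zero would be truncated at zero.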
A simulation study for the approximate confidence intervals of hypergeometric parameter by using actual coverage probability
Kim, Dae-Hak ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1175~1182
In this paper, properties of the exact confidence interval and some approximate confidence intervals for the hypergeometric parameter, that is, the probability of success p in the population, are discussed. The binomial distribution is a well-known discrete distribution with abundant usage. The hypergeometric distribution frequently replaces the binomial distribution when it is desirable to make allowance for the finiteness of the population size. For example, an application of the hypergeometric distribution arises in describing a probability model for the number of children attacked by an infectious disease when a fixed number of them are exposed to it. Exact confidence interval estimation of the hypergeometric parameter is reviewed. We consider the binomial and normal approximations to the hypergeometric distribution. Approximate confidence intervals based on these approximations are also discussed. The performance of the exact and approximate confidence intervals for the hypergeometric parameter is compared in terms of actual coverage probability by small-sample Monte Carlo simulation.
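The idea of actual coverage probability can be illustrated with a short sketch. It evaluates a normal-approximation interval with a finite population correction, and it computes the coverage by exact enumeration of the hypergeometric distribution rather than by the Monte Carlo simulation used in the paper. The interval form and the population sizes are illustrative assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist


def hypergeom_pmf(x, N, M, n):
    """P(X = x) when n items are drawn from N items of which M are successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def normal_interval(x, N, n, level=0.95):
    """Normal-approximation interval for p = M / N with finite population correction."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = x / n
    fpc = (N - n) / (N - 1)
    half = z * sqrt(p_hat * (1 - p_hat) / n * fpc)
    return p_hat - half, p_hat + half


def actual_coverage(N, M, n, level=0.95):
    """Probability that the interval contains the true p, summed over all x."""
    p = M / N
    x_min, x_max = max(0, n - (N - M)), min(n, M)
    coverage = 0.0
    for x in range(x_min, x_max + 1):
        lower, upper = normal_interval(x, N, n, level)
        if lower <= p <= upper:
            coverage += hypergeom_pmf(x, N, M, n)
    return coverage


# Hypothetical population of 50 with 20 successes, sample of 10.
coverage_small_sample = actual_coverage(N=50, M=20, n=10)
```

Comparing the returned value with the nominal level shows how far an approximate interval falls short of, or exceeds, its stated coverage for a given population and sample size.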
Excel macro for applying Bayes' rule
Kim, Jae-Hyun ; Baek, Hoh-Yoo ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1183~1197
The prior distribution is the probability distribution we hold before observing data. Using Bayes' rule, we can compute the posterior distribution, the updated probability distribution, after observing data. Computing the posterior distribution becomes much easier with Excel VBA macros. In addition, we can conveniently compute the successively updated posterior distributions after observing independent, sequential outcomes. In this paper we write some Excel VBA macros for applying Bayes' rule and give some examples.
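The updating step described above can be sketched in a few lines. The example is written in Python rather than Excel VBA, so it illustrates the calculation only and is not the paper's macro; the hypotheses and probabilities are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Return the posterior over hypotheses given one observation.

    prior and likelihood are dicts keyed by hypothesis; likelihood[h] is the
    probability of the observed outcome under hypothesis h.
    """
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: value / total for h, value in unnormalised.items()}


def sequential_update(prior, likelihoods):
    """Apply Bayes' rule once per independent observation, in order."""
    posterior = dict(prior)
    for likelihood in likelihoods:
        posterior = bayes_update(posterior, likelihood)
    return posterior


# Hypothetical example: is a coin fair (P(heads) = 0.5) or biased (P(heads) = 0.8)?
prior = {"fair": 0.5, "biased": 0.5}
heads = {"fair": 0.5, "biased": 0.8}
posterior = sequential_update(prior, [heads, heads])  # two heads observed
```

Because the observations are independent, updating one outcome at a time gives the same posterior as a single update with the product of the likelihoods.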
Herd behavior and volatility in financial markets
Park, Beum-Jo ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1199~1215
Relaxing an unrealistic assumption of a representative percolation model, this paper demonstrates that herd behavior leads to a large increase in volatility but not in trading volume, in contrast with information flows, which give rise to increases in both volatility and trading volume. Although detecting herd behavior has posed a great challenge due to its empirical difficulty, this paper proposes a new methodology for detecting trading days with herding. Furthermore, this paper suggests a herd-behavior-stochastic-volatility model, which accounts for herding in financial markets. An empirical application with high-frequency data from the Korean equity market provides strong evidence in favor of this model specification over the standard stochastic volatility model, strongly supporting the intuition that herd behavior causes excess volatility. In addition, this research indicates that strong persistence in volatility, a prevalent feature of financial markets, is likely attributable to herd behavior rather than news.
Exponential family of circular distributions
Kim, Sung-Su ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1217~1222
In this paper, we show that any circular density can be closely approximated by an exponential family of distributions. We therefore propose an exponential family of distributions as a new family of circular distributions, which is suitable for modeling circular distributions of any shape. In this family of circular distributions, the trigonometric moments are found to be the uniformly minimum variance unbiased estimators (UMVUEs) of the parameters of the distribution. Simulation results and a goodness-of-fit test using an asymmetric real data set show the usefulness of the novel circular distribution.
Estimation of error variance in nonparametric regression under a finite sample using ridge regression
Park, Chun-Gun ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1223~1232
Tong and Wang's estimator (2005) is a new approach to estimating the error variance by the least squares method, in which a simple linear regression is derived asymptotically from Rice's lag-$k$ estimator (1984). Their estimator depends heavily on the choice of the regressor and weights in small samples. In this article, we propose a new approach that uses a local quadratic approximation to set the regressors in the small-sample case. We estimate the error variance as the intercept using ridge regression, because the regressors suffer from multicollinearity. In a small simulation study, our approach performs better than some existing methods in small-sample cases and comparably in large-sample cases. More research is required on unequally spaced points.
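For orientation, the difference-based estimators referred to above can be written out as follows. This is a sketch under the usual assumptions (observations $y_1, \dots, y_n$ at equally spaced design points on $[0,1]$ and a smooth regression function); the symbols $s_k$, $d_k$ and $J$ are introduced here for illustration and are not taken from the abstract.

```latex
% Rice's first-order difference estimator of the error variance
\hat{\sigma}^2_R = \frac{1}{2(n-1)} \sum_{i=2}^{n} (y_i - y_{i-1})^2

% Lag-k version of the same estimator
s_k = \frac{1}{2(n-k)} \sum_{i=k+1}^{n} (y_i - y_{i-k})^2

% Approximate linear relation used to regress s_k on d_k;
% J is a constant depending on the derivative of the regression function
E(s_k) \approx \sigma^2 + J\, d_k, \qquad d_k = \frac{k^2}{n^2}
```

Fitting a straight line to the pairs $(d_k, s_k)$ and reading off the intercept gives an estimate of $\sigma^2$, which is the sense in which the error variance is "estimated as the intercept".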
Geometric ergodicity for the augmented asymmetric power GARCH model
Park, S. ; Kang, S. ; Kim, S. ; Lee, O. ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1233~1240
An augmented asymmetric power GARCH(p, q) process is considered, and conditions for stationarity, geometric ergodicity and the $\beta$-mixing property with exponential decay rate are obtained.
Noninformative priors for the common mean in log-normal distributions
Kang, Sang-Gil ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1241~1250
In this paper, we develop noninformative priors for the log-normal distributions when the parameter of interest is the common mean. We develop Jeffreys' prior, the reference priors and the first order matching priors. It turns out that the reference prior and Jeffreys' prior do not satisfy a first order matching criterion, and that Jeffreys' prior, the reference prior and the first order matching prior are different. A simulation study is performed and a real example is given.
An approach to improving the Lindley estimator
Park, Tae-Ryoung ; Baek, Hoh-Yoo ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1251~1256
Consider a p-variate ($p \geq 4$) normal distribution with mean $\theta$ and identity covariance matrix. Using a simple property of the noncentral chi-square distribution, generalized Bayes estimators dominating the Lindley estimator under quadratic loss are given, based on the methods of Brown, Brewster and Zidek for estimating a normal variance. This result can be extended to the cases where the covariance matrix is completely unknown or $\Sigma = \sigma^2 I$ for an unknown scalar $\sigma^2$.
A study on interaction effect among risk factors of delirium using multifactor dimensionality reduction method
Lee, Jong-Hyeong ; Lee, Yong-Won ; Lee, Yoon-Seok ; Lee, Jea-Young ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1257~1264
Delirium is a neuropsychiatric disorder accompanied by symptoms of hallucination, drowsiness, and tremors. It has high occurrence rates among the elderly, heart disease patients, and burn patients. It is a medical emergency associated with increased morbidity and mortality rates. That is why early detection and prevention of delirium are important. Mental illnesses such as delirium are caused by complex interactions among risk factors. In this paper, we identify risk factors for delirium and the interactions between them using the multifactor dimensionality reduction (MDR) method.
Objective Bayesian testing for the location parameters in the half-normal distributions
Kang, Sang-Gil ; Kim, Dal-Ho ; Lee, Woo-Dong ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1265~1273
This article deals with the problem of testing the equality of the location parameters in half-normal distributions. We propose Bayesian hypothesis testing procedures for the equality of the location parameters under noninformative priors. The noninformative prior is usually improper, which yields a calibration problem in which the Bayes factor is defined only up to arbitrary constants. This problem can be dealt with by using the fractional Bayes factor or the intrinsic Bayes factor. We therefore propose default Bayesian hypothesis testing procedures based on the fractional Bayes factor and the intrinsic Bayes factors under the reference priors. A simulation study and an example are provided.
On connected dominating set games
Kim, Hye-Kyung ;
Journal of the Korean Data and Information Science Society, volume 22, issue 6, 2011, Pages 1275~1281
Many authors have studied cooperative games that arise from variants of dominating set games on graphs. In wireless networks, the connected dominating set is used to reduce routing table size and communication cost. In this paper, we introduce a connected dominating set game to model the cost allocation problem arising from a connected dominating set on a given graph, and we study its core. In addition, we give a polynomial-time algorithm for determining the balancedness of the game on a tree and for finding an element of the core.
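The underlying graph notion can be made concrete with a small check. The sketch below tests whether a vertex subset is a connected dominating set, meaning that every vertex is in the subset or adjacent to it and that the subset induces a connected subgraph. It illustrates the definition only and is not the paper's algorithm for the game or its core; the example graph is hypothetical.

```python
from collections import deque


def is_connected_dominating_set(adj, subset):
    """Check whether `subset` is a connected dominating set of the graph `adj`.

    adj maps each vertex to the set of its neighbours (undirected graph);
    subset is assumed to contain only vertices of the graph.
    """
    subset = set(subset)
    if not subset:
        return False

    # Domination: every vertex is in the subset or has a neighbour in it.
    dominated = set(subset)
    for v in subset:
        dominated.update(adj[v])
    if dominated != set(adj):
        return False

    # Connectivity: breadth-first search restricted to subset vertices.
    start = next(iter(subset))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in subset and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == subset


# Hypothetical tree: the path 1 - 2 - 3 - 4 - 5.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
inner_vertices_ok = is_connected_dominating_set(path, {2, 3, 4})
```

On this path, {2, 4} dominates every vertex but is not connected, which shows why the connectivity requirement can force a strictly larger, and hence more costly, set.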