Journal of the Korean Data and Information Science Society
Journal Basic Information
Korean Data and Information Science Society
Volume 22, Issue 2 - Mar 2011
A study using HGLM on regional difference of the dead due to injuries
Kim, Kil-Hun ; Noh, Maeng-Seok ; Ha, Il-Do ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 137~148
In this paper, we systematically investigate regional differences across cities, towns, and counties in deaths due to injuries from transportation accidents, suicides, and fall accidents, which have recently become an important public health issue in Korea. The data are from the Annual Report on the Cause of Death Statistics in Korea for 2008. They include deaths over the age of 19 from transportation accidents, suicides, and fall accidents, classified according to the International Statistical Classification of Diseases. A Poisson HGLM is applied to estimate the mortality rate under the assumption that the number of deaths follows a Poisson distribution, treating regions as random effects and adjusting for age, sex, and standardized residence tax as fixed effects. Using the predicted random effects, the regional differences among cities, counties, and towns are displayed on disease maps. The results show significant regional differences in mortality rates for transportation accidents and suicides, but no significant differences for fall accidents.
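The model described above can be written compactly. This is only a sketch under assumed notation, since the abstract does not give the formula: $y_{ij}$ is the number of deaths in region $i$ and covariate stratum $j$, $n_{ij}$ is an assumed population-at-risk offset, $x_{ij}$ holds the fixed effects (age, sex, standardized residence tax), and $v_i$ is the regional random effect, whose distribution the abstract does not specify.

```latex
y_{ij} \mid v_i \sim \mathrm{Poisson}(\mu_{ij}), \qquad
\log \mu_{ij} = \log n_{ij} + x_{ij}^{\top}\beta + v_i
```

Under this form, $\exp(v_i)$ acts as a region-specific multiplier on the mortality rate, which is the kind of quantity a disease map would display.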
Undecided inference using logistic regression for credit evaluation
Hong, Chong-Sun ; Jung, Min-Sub ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 149~157
Undecided inference can be regarded as a missing data problem under mechanisms such as MAR and MNAR. Under the MAR assumption, undecided inference uses a logistic regression model: the probability of default for the undecided group is obtained with the regression coefficient vector estimated from the decided group and is compared with the probability of default for the decided group. Under the MNAR assumption, undecided inference uses a logistic regression model with an additional random feature vector. Simulation results based on two real data sets are obtained and compared. Under the MAR assumption, the misclassification rates are not much different from the rate for the raw data. However, the misclassification rates under the MNAR assumption are lower than those under MAR, and they decrease as the proportion of the undecided group increases.
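The MAR step can be illustrated with a minimal sketch: fit a logistic regression on the decided group only, then apply the fitted coefficients to the undecided group. Everything below is an assumption for illustration (synthetic data, two features, a hand-written Newton-Raphson fit, hypothetical function names); it is not the authors' procedure or data, and the MNAR variant is not shown.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Fit logistic regression by Newton-Raphson; returns [intercept, coefficients...]."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ beta)))
        w = p * (1.0 - p)
        grad = Xb.T @ (y - p)
        # Tiny ridge term keeps the Hessian invertible in degenerate cases.
        hess = Xb.T @ (Xb * w[:, None]) + ridge * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_default_prob(beta, X):
    """Predicted probability of default for each row of X."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-(Xb @ beta)))

# Synthetic applicants (assumed): two features and a known default mechanism.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
true_beta = np.array([-1.0, 1.5, -0.8])
p_default = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = rng.binomial(1, p_default)

# "Decided" status assigned independently of the outcome, so MAR holds trivially.
decided = rng.random(n) < 0.7

beta_hat = fit_logistic(X[decided], y[decided])
p_undecided = predict_default_prob(beta_hat, X[~decided])
print("mean predicted default, undecided:", p_undecided.mean())
print("observed default rate, decided:   ", y[decided].mean())
```

Because the outcome plays no role in who is undecided here, the coefficients learned from the decided group transfer directly, which is the property the MAR assumption provides.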
Study on development of an evaluation index for a department homepage
Choi, Seung-Bae ; Kang, Chang-Wan ; Cho, Jang-Sik ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 159~169
A department homepage contains a variety of information about the department, so it plays a very important role for both students and applicants. For this reason, D University evaluates the homepages of all its departments in terms of construction, design, and content. However, the current evaluation method has some problems. Therefore, in this study, we propose an objective index for department homepage evaluation that can also be applied to the homepages of other departments. We also expect the proposed index to provide important information for efficient homepage management.
Estimation on composite lognormal-Pareto distribution based on doubly censored samples
Lee, Kwang-Ho ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 171~177
With the development of the actuarial and insurance industries, the distributions of insurance payment data have been studied in depth by many authors. It is known that these distributions are highly positively skewed and have a long, thick upper tail, like the Pareto or lognormal distribution. In 2005, Cooray and Ananda proposed a new model composed of a lognormal distribution and a Pareto distribution, which they called the composite lognormal-Pareto distribution. They showed that the proposed distribution fit better than the lognormal or Pareto distribution. On the other hand, many insurance payment agreements include provisions for trivially small payments or extremely large ones because of limits on total payment. To handle these cases, in this paper we consider parameter estimation for the composite lognormal-Pareto distribution based on doubly censored samples.
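For orientation, the likelihood for a doubly censored sample has a standard general form. The sketch below assumes Type II double censoring, in which the $r$ smallest and $s$ largest of $n$ observations are censored and only the order statistics $x_{(r+1)} \le \dots \le x_{(n-s)}$ are observed. The abstract does not state the censoring scheme, and $f$ and $F$ stand generically for the density and distribution function of the model being fitted.

```latex
L(\theta) = \frac{n!}{r!\,s!}\,
\bigl[F(x_{(r+1)};\theta)\bigr]^{r}\,
\bigl[1 - F(x_{(n-s)};\theta)\bigr]^{s}
\prod_{i=r+1}^{n-s} f(x_{(i)};\theta)
```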
Proposition of negatively pure association rule threshold
Park, Hee-Chang ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 179~188
An association rule represents the relationship between items in a massive database by quantifying that relationship, and it is among the most frequently used data mining techniques. In general, the association rule technique generates rules of the form 'If A, then B', whereas the negative association rule technique generates rules of the form 'If A, then not B' or 'If not A, then B'. Only by adding negative association rules to existing association rules can we determine whether to promote other products in addition to a given product. In this paper, we propose negatively pure association rules based on negatively pure support, negatively pure confidence, and negatively pure lift to overcome the problems of the negative association rule technique. Checking the usefulness of this technique through numerical examples, we find that the sign of the negatively pure association rule measure indicates the direction of the association.
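As background, the standard support, confidence, and lift for a rule 'If A, then B' and for its negative counterpart 'If A, then not B' can be computed as in the sketch below. The transactions and the function name are made up for illustration, and the negatively pure measures proposed in the paper are not reproduced because the abstract does not define them.

```python
def rule_measures(transactions, antecedent, consequent, negate_consequent=False):
    """Support, confidence, and lift for 'antecedent -> consequent'.

    With negate_consequent=True the rule is 'antecedent -> not consequent'.
    """
    n = len(transactions)
    has_a = [antecedent in t for t in transactions]
    # Flip the membership test when the rule is about the absence of the consequent.
    has_b = [(consequent in t) != negate_consequent for t in transactions]
    p_a = sum(has_a) / n
    p_b = sum(has_b) / n
    p_ab = sum(a and b for a, b in zip(has_a, has_b)) / n
    confidence = p_ab / p_a
    return {"support": p_ab, "confidence": confidence, "lift": confidence / p_b}

# Hypothetical market baskets.
transactions = [
    {"A", "B"}, {"A", "B"}, {"A", "B"}, {"A"},
    {"B"}, {"C"}, {"C"}, {"A", "C"},
]

positive = rule_measures(transactions, "A", "B")
negative = rule_measures(transactions, "A", "B", negate_consequent=True)
print("A -> B:    ", positive)
print("A -> not B:", negative)
```

In this toy data the positive rule has lift above 1 and the negative rule has lift below 1, so the two rules point in opposite directions, as expected.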
Simulation comparison of standardization methods for interview scores
Park, Cheol-Yong ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 189~196
In this study, we perform a simulation to compare frequently used standardization methods for interview scores based on the trimmed mean, rank mean, and z-score mean. We assume that an interviewer's score is a weighted average of the interviewee's true score and independent noise, where the weight is determined by the professionalism of the interviewer. In other words, as an interviewer's professionalism increases, the observed score becomes closer to the true score, and as it decreases, the observed score becomes closer to the noise. The final observed score is assumed to be this weighted average plus the interviewer's tendency bias. In the simulation, the interviewers' scores for each method are computed, and the method with the highest rank correlation between its scores and the true scores is considered best. Simulation results show that when the true scores come from normal distributions, the z-score mean is best in general. When the true scores come from Laplace distributions, the z-score mean is better than the rank mean in the full interview system, where all interviewers meet all interviewees, and the rank mean is better than the z-score mean in the half-split interview system, where each interviewer meets only half of the interviewees. The trimmed mean is worst in general.
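A single replication of the full interview system might look like the sketch below. The weights, biases, sample sizes, noise distribution, and the exact form of each standardization (for example, trimming one highest and one lowest score per interviewee) are assumptions chosen for illustration; the paper's actual settings are not given in the abstract.

```python
import numpy as np

def rankdata(a):
    """Ranks 1..n; ties are broken by position, which is fine for continuous scores."""
    order = np.argsort(a)
    ranks = np.empty(len(a))
    ranks[order] = np.arange(1, len(a) + 1)
    return ranks

def spearman(a, b):
    """Rank correlation computed as the Pearson correlation of the ranks."""
    return np.corrcoef(rankdata(a), rankdata(b))[0, 1]

def simulate_scores(n_interviewees, weights, biases, rng):
    """Observed score = w * true + (1 - w) * noise + interviewer bias."""
    true = rng.normal(size=n_interviewees)
    noise = rng.normal(size=(len(weights), n_interviewees))
    w = np.asarray(weights)[:, None]
    observed = w * true + (1 - w) * noise + np.asarray(biases)[:, None]
    return true, observed  # observed has shape (interviewers, interviewees)

def z_score_mean(observed):
    """Standardize within each interviewer, then average over interviewers."""
    mean = observed.mean(axis=1, keepdims=True)
    std = observed.std(axis=1, keepdims=True)
    return ((observed - mean) / std).mean(axis=0)

def rank_mean(observed):
    """Rank within each interviewer, then average the ranks over interviewers."""
    return np.vstack([rankdata(row) for row in observed]).mean(axis=0)

def trimmed_mean(observed):
    """Drop each interviewee's highest and lowest raw score, average the rest."""
    return np.sort(observed, axis=0)[1:-1].mean(axis=0)

rng = np.random.default_rng(42)
true, observed = simulate_scores(
    200,
    weights=[0.9, 0.8, 0.7, 0.6, 0.5],   # assumed professionalism levels
    biases=[-1.0, 0.0, 1.0, 0.5, -0.5],  # assumed tendency biases
    rng=rng,
)
methods = {"z-score mean": z_score_mean, "rank mean": rank_mean, "trimmed mean": trimmed_mean}
results = {name: spearman(fn(observed), true) for name, fn in methods.items()}
for name, rho in results.items():
    print(f"{name}: rank correlation with true score = {rho:.3f}")
```

A comparison like the one in the paper would repeat this over many replications and over different true-score distributions before ranking the methods.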
Online abnormal events detection with online support vector machine
Park, Hye-Jung ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 197~206
The ability to detect abnormal events online in signals is essential in many real-world signal processing applications. To detect abnormal events, previously known algorithms require an explicit statistical model of the signal and interpret abnormal events as abrupt changes in that model. In general, maximum likelihood and Bayesian estimation theory have been used for estimation as well as detection. However, with these methods it is not easy to estimate a robust and tractable model, and a more flexible way of estimating the model is needed. In this paper, we investigate a machine learning, descriptor-based approach that does not require an explicit statistical model of the descriptors. The approach is based on support vector machines, which are known to be robust, and an online support vector machine with a sequential optimal algorithm is introduced.
Information security risk: Application of the conjoint analysis
Pak, Ro-Jin ; Lee, Dong-Hoon ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 207~215
Risk analysis of information-related assets is conducted primarily according to the standards of the Korea Information and Telecommunications Technology Association (TTA) or the International Organization for Standardization (ISO). The process consists of asset analysis, threat analysis, vulnerability analysis, and response plan analysis. The risk to information-related assets belongs to the operational risks described by the BIS (Bank for International Settlements), so information-related losses can be estimated along the lines the BIS suggests. This paper proposes a way to apply the method proposed by the BIS to estimate the loss of information assets.
Odds ratio of major risk factors associated with delirium by Bayesian network
Lee, Jea-Young ; Choi, Young-Jin ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 217~225
It is important to find risk factors associated with mental disorders. The hazard ratio, which represents the relationship between risk factors and illness, is also of main interest in medicine. We therefore use the odds ratio to explore the relationship between mental disorder and risk factors. In this paper, we apply a Bayesian network to delirium, a mental disorder, select the major risk factors, and calculate odds ratios. In particular, we identify the odds ratios of single risk factors and of multiple risk factors.
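The odds ratio itself is simple to compute, either from a 2x2 table of counts or from two conditional probabilities such as those a fitted Bayesian network would supply. The sketch below uses made-up numbers and hypothetical function names; it does not use the paper's delirium data.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)

def odds_ratio_from_counts(a, b, c, d):
    """a: exposed cases, b: exposed non-cases, c: unexposed cases, d: unexposed non-cases."""
    return (a * d) / (b * c)

def odds_ratio_from_probs(p_exposed, p_unexposed):
    """Odds ratio from P(disease | risk factor) and P(disease | no risk factor)."""
    return odds(p_exposed) / odds(p_unexposed)

# Hypothetical example: 20 of 100 exposed and 10 of 100 unexposed patients develop the condition.
or_counts = odds_ratio_from_counts(20, 80, 10, 90)
or_probs = odds_ratio_from_probs(0.20, 0.10)
print(or_counts, or_probs)
```

For multiple risk factors, the same function applies once the two conditional probabilities are defined for the joint presence and joint absence of the factors, which is one possible convention rather than necessarily the paper's.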
Order selection based on scaled lift
Park, Cheol-Yong ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 227~234
In this study, we propose order selection methods based on scaled lift. The proposal addresses the problem that the lift used by Park and Kim (2010) takes unbounded values, so it is hard to judge how large (or small) a given lift value is. The first scaled lift scales lift itself so that it takes values between 0 and 1, and the second scales lift-1 so that it takes values between -1 and 1. In other words, the first method only scales lift, whereas the second method centers and scales it. We apply the order selection methods based on scaled lift to data on acute appendicitis patients in an emergency room and compare the results with those based on lift.
The proposition of attributably pure confidence in association rule mining
Park, Hee-Chang ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 235~243
Exploring association rules is the most widely used data mining technique. It has been used to find relationships between sets of items based on association thresholds such as support, confidence, and lift. There are many interestingness measures that serve as criteria for evaluating association rules. Among them, confidence is the most frequently used, but it has the drawback that it cannot determine the direction of the association. The net confidence measure was developed to compensate for this drawback, but it is useless when the value of positive confidence equals that of negative confidence. This paper proposes an attributably pure confidence for evaluating association rules and describes some properties of the proposed measure. Comparative studies of confidence, net confidence, and attributably pure confidence are presented through a numerical example. The results show that the attributably pure confidence is better than confidence or net confidence.
Power analysis for
factorial in randomized complete block design
Choi, Young-Hun ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 245~253
The power of the rank-transformed statistic for testing main effects and interaction effects for
factorial design in a randomized complete block design is far superior to the power of the parametric statistic, regardless of the block size, the composition of effects, and the type of population distribution (exponential, double exponential, normal, or uniform).
factorial design in RCBD increases error effects and decreases the power of the parametric statistic, which results in conservativeness. However, the power of the rank-transformed statistic maintains its relative advantage. In general, the rank-transformed statistic shows a relative advantage in power over the parametric statistic with small block sizes and large effect sizes.
Finding the optimal frequency for trade and development of system trading strategies in futures market using dynamic time warping
Lee, Suk-Jun ; Oh, Kyong-Joo ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 255~267
The aim of this study is to use system trading for making investment decisions, applying technical analysis and Dynamic Time Warping (DTW) to identify similar patterns at different frequencies of stock data and to ascertain the optimal timing for trades. The study examines some of the most common patterns in the futures market and uses DTW at each data frequency (10, 30, and 60 minutes, and daily) to discover similar patterns. The recognized similar patterns were verified by running trade simulations after applying specific strategies to the technical indicators. The most profitable strategies among those applied to the common patterns were then applied to the similar patterns, and the results of DTW pattern recognition were examined. The outcome produced useful information for determining the optimal timing for trades by using DTW pattern recognition in system trading and by applying distinct strategies depending on data frequency.
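For readers unfamiliar with DTW, the sketch below shows the classic dynamic-programming distance between two sequences, using absolute difference as the local cost. It is a textbook formulation, not the authors' implementation; the function name and the toy price patterns are assumptions.

```python
import math

def dtw_distance(x, y):
    """Classic DTW distance with |x_i - y_j| as the local cost and no window constraint."""
    n, m = len(x), len(y)
    # cost[i][j] = minimal accumulated cost of aligning x[:i] with y[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # x advances, y repeats
                                 cost[i][j - 1],      # y advances, x repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Toy patterns: the second is a time-stretched copy of the first.
template = [1, 2, 3]
stretched = [1, 2, 2, 3]
shifted = [2, 3, 4]
print(dtw_distance(template, stretched))  # 0.0: same shape, different speed
print(dtw_distance(template, shifted))    # positive: different levels
```

Because DTW allows points to repeat, a pattern that unfolds faster or slower than the template can still match it closely, which is why it suits comparing price patterns across data frequencies.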
Design and evaluation of a cluster-based fuzzy cooperative caching method for MANETs
Lee, Eun-Ju ; Bae, Ihn-Han ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 269~285
Caching of frequently accessed data in mobile ad-hoc networks is a technique that can improve data access performance and availability. Cooperative caching, which allows sharing and coordination of cached data among several clients, can further enhance the potential of caching techniques. In this paper, we propose a cluster-based fuzzy cooperative caching method for mobile ad-hoc networks. The performance of the proposed caching method is evaluated through an analytical model and is compared to that of other cooperative caching methods.
A statistical analysis of the fat mass experimental data using random coefficient model
Jo, Jin-Nam ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 287~296
Thirty-six female students participated in a fat mass loss experiment. They kept a diary of the foods they ate every day, took pictures of the foods, transmitted the pictures to the experimenter by camera phone, and consulted him about fat mass loss once a week over an 8-week period. Fat mass and its related factors were measured repeatedly every week during the 8 weeks. Various random coefficient models were fitted to the repeated measurement data, and the optimal random coefficient model was selected. In the optimal model, the fixed effects of baseline, body mass index, diastolic blood pressure, total cholesterol, and time were highly significant, and a fixed quadratic time effect was present. The variance components corresponding to the subject effect and the linear time effect of the random coefficients were all positive, so the model with random coefficients up to the linear term was considered optimal. The treatment resulted in an average loss of 2.1 kg by the end of the period.
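One way to write a model with the structure described above is sketched here. The notation is assumed rather than taken from the paper: $y_{ij}$ is the fat mass of subject $i$ in week $t_j$, $x_i$ collects the fixed covariates (baseline, body mass index, diastolic blood pressure, total cholesterol), and $(b_{0i}, b_{1i})$ are the subject-specific random intercept and random linear time coefficient.

```latex
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\,t_j + \beta_2\,t_j^{2}
         + x_i^{\top}\gamma + \varepsilon_{ij},
\qquad
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim N(0, G),
\quad \varepsilon_{ij} \sim N(0, \sigma^{2})
```

Here the quadratic time effect $\beta_2$ is fixed only, matching the statement that random coefficients were kept up to the linear term.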
The analytic study and trends in mathematics achievement scores of the NAEA and mathematics item scores of the CSAT in 5 metropolitan cities
Suh, Bo-Euk ; Oh, Kwang-Sik ; Kim, Hye-Kyung ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 297~311
This study investigates and analyzes the results of the 'National Assessment of Educational Achievement (NAEA)' and the 'College Scholastic Ability Test (CSAT)' in 5 metropolitan cities. The objects of study are the NAEA mathematics scores from 2003 to 2009 and the CSAT mathematics items from 1994 to 2010. The contents of the study are as follows. First, the trends in NAEA mathematics scores in elementary, middle, and high school are analyzed according to gender, establishment type, and type of school. Second, the trends in the CSAT are analyzed according to gender and track (cultural science or natural science), along with the trends in the number of applicants for each track. Third, for the same group, the trends in the achievement test scores in the 6th grade of elementary school, the third grade of middle school, and the first grade of high school, and in the CSAT score in the third grade of high school, are considered.
A study on the impact of stress on self-confidence in job-seeking through structural equation modelling
Choi, Hyun-Seok ; Lee, Yeong-Seon ; Ha, Jeong-Cheol ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 313~322
To raise the employment rate of students, colleges provide various forms of support, such as consulting, information provision, courses, and mock interviews. Students also try to achieve a high GPA, obtain certificates, and improve their English ability in order to get jobs. We analyze the relations among self-confidence in getting a job, college support, students' preparation for getting a job, and job-seeking stress via structural equation modelling. We find that college support increases students' preparation for getting a job and that job-seeking stress lessens self-confidence in getting a job.
A study on the success factors of EDI information system: Focused on medical industry
Jo, Hyeon ; Kim, Soung-Hie ; Lee, Seok-Kee ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 323~333
Electronic Data Interchange (EDI) systems are now widely used. Many organizations exchange documents using EDI, and it is used in various industries such as physical distribution, export and import business, customs, and medical care. However, there have been few attempts to empirically investigate the success factors of EDI systems. In this paper, we identify the factors that determine the success of a typical information system and then suggest additional factors specific to EDI systems. Our research model, based mainly on the system success model, is tested by analyzing empirical data acquired from companies in the medical industry. As a result, system quality, information quality, service quality, and perceived sacrifice turned out to be significant for the success of an EDI system, while data security did not. The results of this research are likely to help provide useful guidelines for successful EDI implementations.
Stock market stability index via linear and neural network autoregressive model
Oh, Kyung-Joo ; Kim, Tae-Yoon ; Jung, Ki-Woong ; Kim, Chi-Ho ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 335~351
To resolve the data scarcity problem related to crises, Oh and Kim (2007) proposed a stability-oriented approach that focuses on a base period of the financial market, fits an asymptotically stationary autoregressive model to the base period, and then compares the fitted model with the current market situation. Based on this approach, they developed a financial market instability index. However, since the neural network, their major tool, depends too heavily on the base period, their instability index tends to suffer from inaccuracy. In this study, we consider a linear asymptotically stationary autoregressive model and a neural network to fit the base period and produce two instability indexes independently. The two indexes are then combined into one integrated instability index via a newly proposed combining method. It turns out that the combined instability index performs reliably well.
On geometric ergodicity and β-mixing property of asymmetric power transformed threshold GARCH(1,1) process
Lee, Oe-Sook ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 353~360
We consider an asymmetric power transformed threshold GARCH(1,1) process and find sufficient conditions for the existence of a strictly stationary solution, geometric ergodicity, and the β-mixing property. Moment conditions are given. The Box-Cox transformed threshold GARCH(1,1) process is also considered as a special case.
Noninformative priors for the reliability function of two-parameter exponential distribution
Kang, Sang-Gil ; Kim, Dal-Ho ; Lee, Woo-Dong ;
Journal of the Korean Data and Information Science Society, volume 22, issue 2, 2011, Pages 361~369
In this paper, we develop the reference and matching priors for the reliability function of the two-parameter exponential distribution. We derive the reference priors and the matching prior, and prove the propriety of the joint posterior distribution under a general prior that includes the reference priors and the matching prior. Through a simulation study, we show that the proposed reference priors match the target coverage probabilities in a frequentist sense.
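For reference, the quantity of interest can be stated explicitly. The notation is assumed, with location parameter $\mu$ and scale parameter $\sigma > 0$; the paper may use a different parameterization.

```latex
f(x \mid \mu, \sigma) = \frac{1}{\sigma}\exp\!\left(-\frac{x-\mu}{\sigma}\right), \quad x > \mu,
\qquad
R(t) = P(X > t) = \exp\!\left(-\frac{t-\mu}{\sigma}\right), \quad t > \mu
```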