REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
Journal of the Korean Data and Information Science Society
Korean Data and Information Science Society
Volume 25, Issue 6 - Nov 2014
Selecting order of priority using Delphi and statistical method
Choi, Kyoungho ; Kim, Hyun ; Song, Mi-Jang ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1161~1170
DOI : 10.7465/jkdi.2014.25.6.1161
In today's global competition, intellectual property in novel areas such as traditional knowledge, traditional creations, and genetic resources is perceived as an important resource. Securing solid national competitiveness in knowledge resources overall, including cultural content, brands, and designs, is therefore a pressing issue. To prepare for the intellectual property rights disputes and benefit-sharing problems expected to arise from the Nagoya Protocol, which will come into force after 2014, this study selected 200 items of traditional knowledge from the central region of Korea out of 2,047 literature-based and 931 oral items using a preprocessing step. The top 50 of these were ranked in order of priority by a quantitative research method, the Delphi survey. Of these 50, 30 were literature-based traditional knowledge and 20 were oral traditional knowledge. The results can later serve as basic material for qualitative research such as focus group interviews. The study is also meaningful in that the selected traditional knowledge can contribute substantially to a national resource of traditional biological knowledge that can be recognized internationally and can later establish validity (precedence for patents).
Saddlepoint approximations for the risk measures of portfolios based on skew-normal risk factors
Yu, Hye-Kyung ; Na, Jong-Hwa ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1171~1180
DOI : 10.7465/jkdi.2014.25.6.1171
We consider saddlepoint approximations to VaR (value at risk) and ES (expected shortfall), which are frequently encountered in finance and insurance as measures for risk management. Instead of the traditional normal class of distributions, we assume univariate and multivariate skew-normal distributions as the underlying distributions of linear portfolios. Simulation results show that the suggested saddlepoint approximations are much more accurate than normal approximations.
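For reference, the normal approximation used as the baseline in such comparisons has closed forms for both risk measures. The following is a minimal sketch, assuming a portfolio loss that is normally distributed with mean `mu` and standard deviation `sigma`; the function names and the example numbers are illustrative and are not taken from the paper, and the saddlepoint approximation itself is not implemented here.

```python
from statistics import NormalDist

def normal_var(mu, sigma, alpha):
    """Value at risk: the alpha-quantile of a N(mu, sigma^2) loss."""
    return mu + sigma * NormalDist().inv_cdf(alpha)

def normal_es(mu, sigma, alpha):
    """Expected shortfall: mean loss given that the loss exceeds VaR."""
    z = NormalDist().inv_cdf(alpha)
    return mu + sigma * NormalDist().pdf(z) / (1.0 - alpha)

# Illustrative values only (not from the paper).
print(normal_var(0.0, 1.0, 0.95))
print(normal_es(0.0, 1.0, 0.95))
```

Expected shortfall is always at least as large as VaR at the same level, which is a quick sanity check on any approximation of the two.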
Statistical procedures of add-on trials for bioequivalence in 2×k crossover designs
Woo, Hwahyoung ; Park, Sang-Gue ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1181~1193
DOI : 10.7465/jkdi.2014.25.6.1181
Since July 1, 2008, the Ministry of Food and Drug Safety has allowed an add-on trial when bioequivalence between two drugs fails to be shown. However, showing bioequivalence of highly variable drugs based on 2×2 crossover designs would require too many subjects, so alternative designs such as 2×k crossover experiments are preferred. In this paper, we propose and discuss statistical procedures for add-on trials in 2×k crossover designs.
Power study for 2 × 2 factorial design in 4 × 4 latin square design
Choi, Young Hun ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1195~1205
DOI : 10.7465/jkdi.2014.25.6.1195
Compared with a single design, the powers of the rank transformed statistic for testing main and interaction effects for a 2 × 2 factorial design in a 4 × 4 latin square design increase rapidly as the effect size and replication size increase. In general, the powers of the rank transformed statistic are superior regardless of the effect composition and the type of error distribution when nontesting factors are few and effect sizes are small. The powers of the rank transformed statistic are much higher than those of the parametric statistic under exponential and double exponential distributions, and are very similar to those of the parametric statistic under normal and uniform distributions.
Face/non-face channel fit comparison of life insurance company and non-life insurance company using social network analysis
Chun, Heuiju ; Leem, Byunghak ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1207~1219
DOI : 10.7465/jkdi.2014.25.6.1207
In this study, 1) we compare the face and non-face channels of life and non-life insurance companies using insurance employees' opinions on the suitability of channel types, channel properties, and the channel evaluation items required when selling insurance products, and 2) we construct two social networks, one for life insurance companies and one for non-life insurance companies, compare the properties of the two networks, and suggest a direction for sales channel strategy. Comparing the social networks created from the channel fit evaluations shows that employees of life insurance companies hold more opinions in common than those of non-life insurance companies and can therefore pursue a more uniform channel strategy. Property insurance companies, however, need to manage their own channel strategies based on their own circumstances.
Prediction of K-league soccer scores using bivariate Poisson distributions
Lee, Jang Taek ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1221~1229
DOI : 10.7465/jkdi.2014.25.6.1221
In this paper we choose the best model among several bivariate Poisson models for Korean soccer data. The models considered allow for correlation between the numbers of goals scored by the two competing teams. We use an R package called bivpois for bivariate Poisson regression models and K-league data for the 1983-2012 seasons. We conclude that the best-fitting model according to the AIC and BIC is the bivariate Poisson model with constant covariance. The zero- and diagonal-inflated models did not improve the model fit. The model can be used to examine the home-away effect, goodness of fit, and attack and defense parameters.
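The correlation between the two teams' goals can be illustrated with the standard trivariate-reduction form of the bivariate Poisson distribution, in which a shared Poisson component induces a covariance equal to its rate. The sketch below is in Python rather than R and does not use the bivpois package; the function names and the rates `lam1`, `lam2`, `lam3` are illustrative assumptions, and the paper's exact parameterization may differ.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam); lam may be 0."""
    return exp(-lam) * lam ** k / factorial(k)

def bivariate_poisson_pmf(x, y, lam1, lam2, lam3):
    """P(X = x, Y = y) where X = W1 + W3 and Y = W2 + W3, with
    W1, W2, W3 independent Poisson(lam1), Poisson(lam2), Poisson(lam3).
    The shared term W3 gives Cov(X, Y) = lam3."""
    return sum(
        poisson_pmf(x - k, lam1) * poisson_pmf(y - k, lam2) * poisson_pmf(k, lam3)
        for k in range(min(x, y) + 1)
    )

# Hypothetical rates: probability of a 1-1 draw.
print(bivariate_poisson_pmf(1, 1, 1.2, 0.9, 0.1))
```

Setting `lam3 = 0` reduces the model to two independent Poisson counts, which makes the covariance term easy to interpret as the departure from independence.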
Soccer goal distributions in K-league
Lee, Jang Taek ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1231~1239
DOI : 10.7465/jkdi.2014.25.6.1231
In this paper we analyse the distributions of the number of goals scored by home teams and away teams in K-league soccer matches between 1983 and 2012. The K-league data are modeled using statistical distributions such as the Poisson, negative binomial, extreme value, and zero-inflated Poisson. How closely the home and away goals fit the different distributions is tested with chi-square goodness-of-fit tests. According to these tests, the Poisson distribution gives the best fit to the home goals data, whereas the away goals data are best modeled by the zero-inflated Poisson distribution. There is also some weak evidence of dependence between home and away goals.
Evaluation of research performances for 28 national universities
Jeong, Dong Bin ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1241~1251
DOI : 10.7465/jkdi.2014.25.6.1241
Based on 4 principal research-performance criteria for 28 national universities in Korea, both cluster analysis and multidimensional scaling are performed in this paper. Cluster analysis classifies the universities into relatively homogeneous groups that are initially unknown, creating new groupings without any preconceived notion of what clusters may arise. Furthermore, the level of similarity among individual universities can be visualized in a multidimensional space by assigning each university coordinates in each of 2 dimensions. Through these results, the type and characteristics of each university can be evaluated in relative terms and used in practice for the policy of the university authority.
Panel data analysis with regression trees
Chang, Youngjae ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1253~1262
DOI : 10.7465/jkdi.2014.25.6.1253
A regression tree is a tree-structured solution in which a simple regression model is fitted to the data in each node produced by recursive partitioning of the predictor space. There have been many efforts to apply tree algorithms to various regression problems such as logistic regression and quantile regression. Recently, the algorithms have been extended to panel data analysis, for example the RE-EM algorithm by Sela and Simonoff (2012) and the extension of GUIDE by Loh and Zheng (2013). The algorithms are briefly introduced and the prediction accuracy of three methods is compared in this paper. In general, RE-EM shows good prediction accuracy, with the smallest MSEs in the simulation study. A RE-EM tree fitted to business survey index (BSI) panel data shows that the sales BSI is the main factor affecting business entrepreneurs' economic sentiment. Within the relatively high sales group, the economic sentiment BSI of non-manufacturing industries is higher than that of manufacturing industries.
Review and discussion of marginalized random effects models
Jeon, Joo Yeong ; Lee, Keunbaik ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1263~1272
DOI : 10.7465/jkdi.2014.25.6.1263
Longitudinal categorical data commonly arise in the medical, health, and social sciences. In these data, the correlation among repeated outcomes is taken into account in order to explain the effects of covariates accurately. In this paper, we introduce marginalized random effects models, which are used to estimate the population-averaged effects of covariates. We also review how these models have been developed. A real data analysis using marginalized random effects models is presented.
Statistical analysis of life pattern and functional cosmetics awareness
Shin, Jae-Kyoung ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1273~1281
DOI : 10.7465/jkdi.2014.25.6.1273
As industrial technology advances and various materials are developed, the cosmetics industry is seeing many kinds of new cosmetic products. At the same time, the Cosmetics Act has come into effect and cosmetics ingredient labeling has been implemented. Interest in life patterns such as life management, biorhythm, correct posture, and stress symptoms is also increasing as people seek a higher quality of life. These life patterns are closely connected not only with quality of life and health but also with skin management, so this paper conducts a survey on the connection between functional cosmetics awareness and life pattern. The results show that, out of 16 questions concerning cosmetics awareness, responses to questions 3, 5, 6 and 11 differ significantly among colleges. Furthermore, cross analysis with life pattern shows significant differences between the year of respondents and stress symptoms and between the year of respondents and correct posture. Lastly, significant differences are found between gender and biorhythm, gender and correct posture, and gender and stress symptoms. Further research is needed to reveal the differences between college students and the general public.
Classification of large-scale data and data batch stream with forward stagewise algorithm
Yoon, Young Joo ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1283~1291
DOI : 10.7465/jkdi.2014.25.6.1283
In this paper, we propose a forward stagewise algorithm for settings where the data are very large or arrive in batches sequentially over time. In these settings, ordinary boosting algorithms may be greedy and perform worse in the presence of class noise. To overcome these problems and handle large-scale data or data batch streams, we modify the forward stagewise algorithm. On simulated and real data sets, the modified algorithm gives better results than boosting algorithms for both large-scale data and data batch streams, with or without concept drift.
Analysis of statistical models on temperature at the Seosan city in Korea
Lee, Hoonja ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1293~1300
DOI : 10.7465/jkdi.2014.25.6.1293
Temperature data influence various national policies. In this article, the autoregressive error (ARE) model is considered for analyzing monthly and seasonal temperature data at the Seosan monitoring site in the northern part of Chungcheong Namdo, Korea. In the ARE model, five meteorological variables, four greenhouse gas variables and five air pollution variables are used as explanatory variables for the temperature data set. The five meteorological variables are wind speed, rainfall, radiation, amount of cloud, and relative humidity. The four greenhouse gas variables are carbon dioxide (CO₂), methane (CH₄), nitrous oxide (N₂O), and chlorofluorocarbon. The five air pollution explanatory variables are particulate matter, sulfur dioxide (SO₂), nitrogen dioxide (NO₂), ozone (O₃), and carbon monoxide (CO). The results show that the monthly ARE model explains about 39-63% of the variation in temperature. However, the ARE model is expected to improve when more explanatory variables are added.
A study on library users' loyalty with users' satisfaction as a moderating variable: K university case
Choi, Hyun Seok ; Park, Cheolyong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1301~1313
DOI : 10.7465/jkdi.2014.25.6.1301
Universities are places where knowledge and information in various areas are created and shared, and university libraries are the main places for providing education services. In this study, we survey the students of K university on their satisfaction with the contents offered, the environment/facilities, and the library service, as well as on users' satisfaction and users' loyalty. Structural equation models are used to verify whether satisfaction with the contents offered, the environment/facilities, and the library service affects users' loyalty both directly and indirectly, with users' satisfaction as a moderating variable. The results show that satisfaction with the contents offered affects loyalty both directly and indirectly, whereas satisfaction with the environment/facilities and with the library service affects loyalty directly but not indirectly.
The estimation of lifetime income replacement rates
Shin, Seunghee ; Son, Hyunsub ; Lee, Hangsuck ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1315~1331
DOI : 10.7465/jkdi.2014.25.6.1315
The replacement rate, which is the ratio of retirement income to preretirement income, is a valuable evaluation measure when discussing social security benefit levels or the adequacy of retirement income. However, in previous studies the replacement rate has been used only as an index of the benefit level at the time of retirement or over a specific retirement period. This article analyzes how much the uncertainty of survival influences retirement income, and presents a replacement rate that reflects the period of survival. The researchers named this index the lifetime income replacement rate. Analysis based on this index gives lifetime income replacement rates of 38.3% for men and 41.1% for women, assuming 20 years of enrollment in three pension plans: the national pension, a retirement pension, and an individual annuity.
On multivariate GARCH model selection based on risk management
Park, SeRin ; Baek, Changryong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1333~1343
DOI : 10.7465/jkdi.2014.25.6.1333
Hansen and Lunde (2005) documented that a univariate GARCH(1,1) model is no worse than other sophisticated GARCH models in terms of prediction errors such as MSPE and MAE. Here, we extend Hansen and Lunde (2005) by considering multivariate GARCH models and incorporating risk management measures such as VaR and fail percentage. Our Monte Carlo simulation study shows that the multivariate GARCH(1,1) model also performs well compared to asymmetric GARCH models. However, we suggest that actual model selection should be done with care in light of risk management. The approach is applied to the realized volatilities of the KOSPI, NASDAQ and HANG SENG indices over the most recent 10 years.
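The univariate GARCH(1,1) recursion and a VaR fail percentage can be sketched as follows. This is only an illustration under stated assumptions: returns have zero mean and are conditionally normal, the parameters `omega`, `alpha`, `beta` are given rather than estimated, and the function names are hypothetical. The multivariate models studied in the paper are not reproduced.

```python
from statistics import NormalDist

def garch11_variances(returns, omega, alpha, beta):
    """Conditional variances h_t = omega + alpha * r_{t-1}^2 + beta * h_{t-1}.

    Starts from the unconditional variance omega / (1 - alpha - beta),
    which requires alpha + beta < 1. The result has one variance per
    observation plus the one-step-ahead forecast as the last element.
    """
    if alpha + beta >= 1:
        raise ValueError("need alpha + beta < 1 for a finite unconditional variance")
    h = [omega / (1.0 - alpha - beta)]
    for r in returns:
        h.append(omega + alpha * r * r + beta * h[-1])
    return h

def fail_percentage(returns, variances, level=0.99):
    """Share of observations (in %) whose loss exceeds the normal VaR."""
    z = NormalDist().inv_cdf(level)
    fails = sum(1 for r, h in zip(returns, variances) if -r > z * h ** 0.5)
    return 100.0 * fails / len(returns)

# Hypothetical daily returns and parameters.
rets = [0.5, -1.2, 0.3, -3.5, 0.1]
h = garch11_variances(rets, omega=0.1, alpha=0.1, beta=0.8)
print(fail_percentage(rets, h, level=0.99))
```

For a well-calibrated model the fail percentage should be close to 100 × (1 − level), so large departures in either direction indicate that the volatility forecasts are too low or too high.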
Development of association rule threshold by balancing of relative rule accuracy
Park, Hee Chang ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1345~1352
DOI : 10.7465/jkdi.2014.25.6.1345
Data mining is a representative methodology for obtaining meaningful information in the era of big data. According to Wikipedia, association rule learning is a popular and well-researched method for discovering interesting relationships between itemsets in large databases using association thresholds. It is intended to identify strong rules discovered in databases using different interestingness measures. Unlike general association rule mining, inverse association rule mining finds rules stating that a particular item does not occur if another item does not occur. If the two types of association rule are considered simultaneously, we can obtain marketing information for related products as well as for a specific product. In this paper, we propose a balanced attributable relative accuracy applicable to these association rule techniques, and then check the three conditions for interestingness measures given by Piatetsky-Shapiro (1991). Rule accuracy, relative accuracy, attributable relative accuracy, and balanced attributable relative accuracy are compared in a numerical example. The results show that balanced attributable relative accuracy is better than the other accuracy measures.
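The difference between an ordinary rule (A implies B) and an inverse rule (absence of A implies absence of B) can be made concrete with the textbook measures support, confidence and lift. The sketch below uses only these standard measures; it does not implement the balanced attributable relative accuracy proposed in the paper, whose definition is not given in the abstract, and the basket data are made up for illustration.

```python
def support(transactions, items):
    """Fraction of transactions containing every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def confidence(transactions, a, b):
    """Confidence of the rule a -> b: P(b present | a present).
    Assumes at least one transaction contains a."""
    return support(transactions, [a, b]) / support(transactions, [a])

def lift(transactions, a, b):
    """Ratio of the joint support to the support expected under independence."""
    return support(transactions, [a, b]) / (
        support(transactions, [a]) * support(transactions, [b])
    )

def inverse_confidence(transactions, a, b):
    """Confidence of the inverse rule (not a) -> (not b).
    Assumes at least one transaction lacks a."""
    without_a = [t for t in transactions if a not in t]
    return sum(1 for t in without_a if b not in t) / len(without_a)

# Made-up baskets for illustration.
baskets = [{"milk", "bread"}, {"milk", "bread"}, {"milk"},
           {"bread"}, {"eggs"}, {"eggs"}]
print(confidence(baskets, "milk", "bread"))
print(inverse_confidence(baskets, "milk", "bread"))
```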
The effects of online nursing education contents on self efficacy, knowledge, and performance of nursing skills
Nam, Hyea Sook ; Son, Kyeong Ae ; Kim, Su Hyun ; Song, Yeoungsuk ; Kwon, So-Hi ; Oh, Eun Hee ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1353~1360
DOI : 10.7465/jkdi.2014.25.6.1353
This study aimed to evaluate the effects of a nursing skills program that offers online access to evidence-based skills and procedures. The nursing skill tested in this study was tracheostomy suctioning in Mosby's Nursing Skills. The study used a non-synchronized control group pre-posttest quasi-experimental design. The experimental group, which used Mosby's Nursing Skills, had significantly higher levels of knowledge and skill in tracheostomy suctioning, but not of self-efficacy. The online nursing skills program was shown to be effective in improving students' nursing skills, and its use in nursing practicum is suggested.
An analysis on the change rate of housing rent price index
Yeon, Kyu Pil ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1361~1369
DOI : 10.7465/jkdi.2014.25.6.1361
This research analyzes the change rate of the housing rent price index produced by KAB (Korea Appraisal Board) in the monthly periodical, Survey on Housing Monthly Rent. The index is a very important and useful indicator for understanding and diagnosing the house rental market. However, the index is criticized because it tends to decline when the price level of Jeonse (i.e., a typical type of dwelling in Korea, generally leased on a deposit basis for 1 or 2 years) rises sharply, which is inconsistent with the actual economic sentiment of tenants. We identify why this phenomenon occurs and suggest a simple but novel method for properly analyzing the change rate of the index. The main findings are as follows. The key factor triggering the problem is the use of the Jeonse-to-monthly-rent conversion rate in constructing the rent price indexes. We separate the effect of the conversion rate from the change rate of the index and quantify the adjusted real change rate, which reveals an increase in the rent price level that was previously masked by the conversion rate.
Comparison of ensemble pruning methods using Lasso-bagging and WAVE-bagging
Kwak, Seungwoo ; Kim, Hyunjoong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1371~1383
DOI : 10.7465/jkdi.2014.25.6.1371
A classification ensemble is a method that combines diverse classifiers to enhance classification accuracy. It is known that an ensemble method is successful when the classifiers that participate in the ensemble are accurate and diverse. However, it is common for an ensemble to include less accurate and similar classifiers as well as accurate and diverse ones. Ensemble pruning methods have been developed to construct an ensemble by choosing only accurate and diverse classifiers. In this article, we propose an ensemble pruning method called WAVE-bagging and compare its results with those of an existing pruning method called Lasso-bagging. An extensive empirical comparison using 26 real datasets shows that WAVE-bagging performs better than Lasso-bagging.
Major gene identification for FASN gene in Korean cattles by data mining
Kim, Byung-Doo ; Kim, Hyun-Ji ; Lee, Seong-Won ; Lee, Jea-Young ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1385~1395
DOI : 10.7465/jkdi.2014.25.6.1385
Economic traits of livestock are affected by environmental and genetic factors. In addition, they are affected not by a single gene but by interactions among genes. We used a linear regression model to adjust for environmental factors. To identify gene-gene interaction effects, we applied data mining techniques such as neural networks, logistic regression, CART and C5.0 using five SNPs (single nucleotide polymorphisms) of FASN (fatty acid synthase). We divided the data into training (60%) and testing (40%) sets, and applied the model built on the training data to the testing data. Comparing prediction accuracy, C5.0 was identified as the best model. Superior genotypes were selected using the decision tree.
The unit-nonresponse status and use of weight in the KCYPS
Lee, Hwa-Jung ; Kang, Suk-Bok ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1397~1405
DOI : 10.7465/jkdi.2014.25.6.1397
Unit-nonresponse or item-nonresponse usually occurs in surveys. When the nonresponse rate is high, an analysis that ignores nonresponse may lead to misleading results, so the nonresponse needs to be characterized. In cross-sectional data it is possible to study the characteristics of item-nonresponse, but it is hard to study those of unit-nonresponse. To identify the characteristics of unit-nonresponse, this study used the data on first-year middle school students from the Korea children and youth panel survey (KCYPS). We investigated how nonresponse has been handled and analyzed the characteristics of unit-nonresponse. Most papers simply removed the nonresponses, and few used weights. In this paper, we compared the results of analyses with and without weights. The analyses using weights yielded many more statistically significant results than those that removed nonresponses. More discussion will be needed.
The influence of parents conflict on youth's anxiety and school adaptation
Min, Dae Kee ; Choi, Mi-Kyung ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1407~1418
DOI : 10.7465/jkdi.2014.25.6.1407
Korean youth spend a tremendous amount of time in school preparing for college admissions. Their academic achievement and overall satisfaction with their lives are affected by how well they adapt to life in school. Successful adaptation to school is important enough to affect a student's future social life. One of the factors that affect adaptation to school is the psychological condition of adolescent anxiety. Anxiety is one of the common mental disorders that appear in people who are not familiar with new environments. Anxiety is known to be related to behavioral problems and to problems with psychological and emotional adaptation. This condition increases dramatically in adolescents. Parental conflict in particular is known to be a major factor affecting youth anxiety. As parental conflict became more severe, children felt more negative emotions such as anger, sadness and worry. Moreover, when a child's issue caused the parental conflict, there were more side effects on the emotional condition of the child. This study shows how parental conflict affects a child's anxiety and school life. This problem is analyzed through structural equation modeling.
On the development of DES encryption based on Excel Macro
Kim, Daehak ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1419~1429
DOI : 10.7465/jkdi.2014.25.6.1419
In this paper, we consider the development of DES (data encryption standard) encryption based on Microsoft Excel Macro. DES was adopted as FIPS (federal information processing standard) 46 of the USA in 1977. A concrete explanation of DES is given. Algorithms for DES encryption are adapted to Excel Macro. By repeating, with Excel Macro, the 16 rounds that consist of diffusion (which hides the relation between plain text and cipher text) and confusion (which hides the relation between cipher key and cipher text), we can easily obtain the desired DES cipher text.
Structuring of unstructured big data and visual interpretation
Lee, Kyeongjun ; Noh, Yunhwan ; Yoon, Sanggyeong ; Cho, Youngseuk ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1431~1438
DOI : 10.7465/jkdi.2014.25.6.1431
We analyzed articles from "Kukje Shinmun" and "Busan Ilbo", two local newspapers of Busan Metropolitan City, published from January 1, 2013 to December 31, 2013. We analyzed meaningful patterns inherent in 2889 articles whose titles include "Busan" and "Traffic", together with related data. Text mining, which is a part of data mining, was used for the social network analysis (SNA). HDFS and MapReduce (from the Hadoop ecosystem), an open-source framework based on Java, were used in a Linux environment (Ubuntu 12.04 LTS) for structuring the unstructured data and for the storage, processing and analysis of big data. We implemented a new algorithm that gives better visualization than the default one from an R package by setting color and thickness based on the weights of each node and of the lines connecting the nodes.
A longitudinal study for child aggression with Korea Welfare Panel Study data
Choi, Nayeon ; Huh, Jib ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1439~1447
DOI : 10.7465/jkdi.2014.25.6.1439
Most of the literature on Korean child aggression is based on cross-sectional data sets. Although there is a related study with a longitudinal data set, it assumes that the repeated measurements in the longitudinal data are mutually independent. A longitudinal data analysis of Korean child aggression is therefore necessary. This study analyzes the effects on child aggression of child development outcomes including academic achievement, self-esteem, depression anxiety, delinquency, victimization by peers, abuse by parents and internet usage time, using Korea Welfare Panel Study data observed three times between 2006 and 2012. Since the Korea Welfare Panel Study data have missing values, missingness at random is assumed. The linear mixed effect model and restricted maximum likelihood estimation are considered.
Hedging effectiveness of KOSPI200 index futures through VECM-CC-GARCH model
Kwon, Dongan ; Lee, Taewook ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1449~1466
DOI : 10.7465/jkdi.2014.25.6.1449
In this paper, we consider a hedge portfolio based on futures on an underlying asset. A classical way to estimate the hedge ratio for a hedge portfolio of a spot and futures is regression analysis. However, regression analysis cannot reflect the long-run equilibrium between spot and futures or the volatility clustering in the conditional variance of financial time series. To overcome these defects, we analyzed the KOSPI200 index and futures using a VECM-CC-GARCH model and computed the hedge ratio from the estimated conditional variance-covariance matrix. In the real data analysis, we compared the regression and VECM-CC-GARCH models in terms of hedging effectiveness based on the variance, value at risk and expected shortfall of the log-returns of the hedge portfolio. The empirical results show that the multivariate GARCH models significantly outperform regression analysis and improve hedging effectiveness in periods of high volatility.
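The classical regression-based hedge ratio mentioned above is the slope of spot returns on futures returns, Cov(s, f) / Var(f), and variance-based hedging effectiveness is the share of spot variance removed by the hedge. The sketch below illustrates only this static baseline with made-up returns; the function names are hypothetical, and the time-varying VECM-CC-GARCH hedge ratio is not implemented.

```python
def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance; covariance(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def hedge_ratio(spot, futures):
    """Minimum-variance hedge ratio: slope of spot returns on futures returns."""
    return covariance(spot, futures) / covariance(futures, futures)

def hedging_effectiveness(spot, futures, h):
    """Variance reduction achieved by the hedged portfolio spot - h * futures."""
    hedged = [s - h * f for s, f in zip(spot, futures)]
    return 1.0 - covariance(hedged, hedged) / covariance(spot, spot)

# Made-up log-returns for illustration.
spot = [0.010, -0.020, 0.030, -0.010, 0.005]
futures = [0.012, -0.018, 0.027, -0.013, 0.004]
h = hedge_ratio(spot, futures)
print(h, hedging_effectiveness(spot, futures, h))
```

A conditional model replaces the constant covariance and variance in `hedge_ratio` with their time-varying estimates, so the ratio is recomputed each period.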
Computer intensive method for extended Euclidean algorithm
Kim, Daehak ; Oh, Kwang Sik ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1467~1474
DOI : 10.7465/jkdi.2014.25.6.1467
In this paper, we consider two computer intensive methods for the extended Euclidean algorithm. The two methods we propose are a C-programming based approach and a Microsoft Excel based method. These methods are applied to the derivation of the greatest common divisor, the multiplicative inverse for modular operations, and the solution of Diophantine equations. A concrete investigation of the extended Euclidean algorithm with the computer intensive process is given. As an application of the extended Euclidean algorithm, we consider the RSA encryption method, which is still popular.
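The three applications listed above all follow from one routine that returns the Bézout coefficients along with the greatest common divisor. The following is a minimal sketch in Python, not the C or Excel implementation described in the paper; the function names are illustrative and positive integer inputs are assumed.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def mod_inverse(a, m):
    """Multiplicative inverse of a modulo m; exists only if gcd(a, m) == 1."""
    g, x, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return x % m

def solve_diophantine(a, b, c):
    """One integer solution (x, y) of a*x + b*y == c, or None if none exists."""
    g, x, y = extended_gcd(a, b)
    if c % g != 0:
        return None
    k = c // g
    return x * k, y * k

# Textbook RSA toy numbers: public exponent 17, phi(n) = 3120.
print(mod_inverse(17, 3120))
```

In RSA, the private exponent is exactly such a modular inverse of the public exponent modulo φ(n), which is why the algorithm appears in key generation.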
Association analysis of admission factors and academic achievement
Ko, Jeong Hwan ; Song, Joon Hyub ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1475~1480
DOI : 10.7465/jkdi.2014.25.6.1475
This article analyzes the academic achievement of students who entered A university from 2011 to 2012 using grade point average (GPA). The purpose of this analysis is to find the relationship between admission factors and academic achievement. Contrary to our expectation, the GPA of students selected through KSAT is higher than that of students selected through CSAT. Admission factors therefore need to be considered carefully when designing and running university admission types.
A study on prediction for attendances of Korean probaseball games using covariates
Han, Ga-Hee ; Chung, Jigyu ; Yoo, Jae Keun ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1481~1489
DOI : 10.7465/jkdi.2014.25.6.1481
ARIMA models have so far been widely adopted for predicting yearly total attendance at Korean pro baseball games. In this paper, we discuss two alternatives, ARIMAX and growth curves with an exogenous variable, for predicting attendance. Using the exogenous variable improves the prediction compared to ARIMA. We conclude that various statistical methods should be considered for better prediction, and that the results can be applied to predicting attendance in other pro sports.
Selection of extra support points for polynomial regression
Kim, Young-Il ; Jang, Dae-Heung ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1491~1498
DOI : 10.7465/jkdi.2014.25.6.1491
A major criticism of optimal experimental design is that it depends heavily on the model and its accompanying assumptions, which often leads to the number of support points being equal to the number of parameters in the model. In the past, a polynomial model of higher degree has often been assumed in order to handle the experimental design for polynomial regression of lower degree. In this paper, we search for the set of designs that are robust to departures from the assumed model. The designs are categorized with respect to D-efficiency. The approach of O'Brien (1995) is discussed in the univariate polynomial regression setting.
A Wilcoxon signed-rank test for random walk hypothesis based on slopes
Kim, Tae Yoon ; Park, Cheolyong ; Kim, Seul Gee ; Kim, Min Seok ; Lee, Woo Jung ; Kwon, Yunji ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1499~1506
DOI : 10.7465/jkdi.2014.25.6.1499
Random walks are used to describe random phenomena in various areas, but the tests for random walk developed so far are known to suffer from size distortion and low power. Kim et al. (2014) proposed a sign test for the unit root (ρ = 1) hypothesis based on slopes. This article proposes a Wilcoxon signed-rank test based on slopes for the unit root hypothesis and compares it with the augmented Dickey-Fuller (ADF) test and the sign test in a simulation study. Our results confirm that the nonparametric tests are better than the ADF test for small samples such as n = 30. The results also show that the sign test is better than the Wilcoxon signed-rank test and that for 0 < ρ < 1 (-1 < ρ < 0), the nonparametric tests suffer from power loss (improvement) as the error distribution changes from normal to double exponential.
Volatility by the level of interest rate and RBC
An, Junyong ; Lee, Hangsuck ; Ju, Hyo Chan ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1507~1520
DOI : 10.7465/jkdi.2014.25.6.1507
In this paper, we show that there is a positive correlation between the level and the volatility of the interest rate, and thus suggest that the interest rate volatility coefficient (IRVC), a factor used in evaluating the interest rate risk that insurers are exposed to, should be chosen in accordance with the level of the interest rate. To this end, we calculate the historical volatility of the interest rate using data on government bond yields and show a proportionate relationship between the interest rate and its historical volatility. A review of the exponential Vasicek (EV) and Cox-Ingersoll-Ross (CIR) models for the interest rate also confirms the positive correlation between them. The IRVC estimates from the EV and CIR models are 0.9 and 1.1, respectively, which are much smaller than the value under the current risk-based capital (RBC) requirement. We provide modified IRVCs that reflect the level of the interest rate under the two models. Using the modified IRVCs can be a more reasonable way to evaluate the interest rate risk that insurers face.
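As a reminder of why both models tie volatility to the rate level, the textbook forms of the two short-rate models can be written as below. The symbols a, b, σ and W_t are generic notation assumed here and are not necessarily the parameterization used in the paper.

```latex
% Exponential Vasicek (EV): the log-rate follows a mean-reverting process,
% so by Ito's lemma the diffusion term of r_t is proportional to r_t.
d\ln r_t = a\,(b - \ln r_t)\,dt + \sigma\,dW_t
\quad\Longrightarrow\quad
\text{diffusion term of } dr_t:\ \sigma\, r_t\, dW_t

% Cox-Ingersoll-Ross (CIR): the diffusion term is proportional to sqrt(r_t).
dr_t = a\,(b - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t
```

In both cases the absolute volatility of the rate grows with its level, linearly under EV and with the square root under CIR, which is consistent with the positive correlation described in the abstract.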
Alternative accuracy for multiple ROC analysis
Hong, Chong Sun ; Wu, Zhi Qiang ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1521~1530
DOI : 10.7465/jkdi.2014.25.6.1521
ROC analysis is considered for multiple-class diagnosis. Many criteria exist for finding optimal thresholds and measuring the accuracy of diagnostic tests in k-dimensional ROC analysis. In this paper, we propose a diagnostic accuracy measure called the correct classification simple rate, which is defined as the summation of true rates for each classification distribution and expressed as a function of the summation of sequential true rates for two consecutive distributions. This measure does not weight accuracy across categories by category prevalence and is comparable across populations for multiple-class diagnosis. We find that this accuracy measure is not only related to Kolmogorov-Smirnov statistics but can also be represented as a linear function of some optimal threshold criteria. With these facts, the suggested measure could be applied to tests comparing multiple distributions.
Visualization and interpretation of cancer data using linked micromap plots
Park, Se Jin ; Ahn, Jeong Yong ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1531~1538
DOI : 10.7465/jkdi.2014.25.6.1531
The causes of cancer are diverse, complex, and only partially understood. Many factors, including health behaviors, socioeconomic environments and geographical locations, can directly damage genes or combine with existing genetic faults within cells to cause cancerous mutations. Collecting cancer data and reporting the statistics are therefore important to help identify health trends and establish normal health changes in geographical areas. In this article, we analyze cancer data and demonstrate how spatial patterns of the age-standardized rate and health indicators can be examined visually and simultaneously using linked micromap plots. The data analysis shows that the age-standardized rate is positively correlated with thyroid and breast cancer but negatively correlated with smoking and drinking. In addition, the regions with a high age-standardized rate are located in the southwest and in areas of high population density, while the standardized mortality ratio is higher in the southwest and northeast, where there are many rural areas.
Support vector quantile regression for autoregressive data
Hwang, Hyungtae ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1539~1547
DOI : 10.7465/jkdi.2014.25.6.1539
In this paper, we apply the autoregressive process to nonlinear quantile regression in order to infer nonlinear quantile regression models for autocorrelated data. We propose a kernel method for autoregressive data which estimates the nonlinear quantile regression function by kernel machines. Artificial and real examples are provided to indicate the usefulness of the proposed method for estimating the quantile regression function in the presence of autocorrelation in the data.
An approach to improving the James-Stein estimator shrinking towards projection vectors
Park, Tae Ryong ; Baek, Hoh Yoo ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1549~1555
DOI : 10.7465/jkdi.2014.25.6.1549
Consider a p-variate normal distribution (
, q = rank(
) with a projection matrix
). Using a simple property of the noncentral chi-square distribution, the generalized Bayes estimators dominating the James-Stein estimator shrinking towards projection vectors under quadratic loss are given, based on the methods of Brown, Brewster and Zidek for estimating a normal variance. This result can be extended to the cases where the covariance matrix is completely unknown or
for an unknown scalar
Can a securities law improve investor rationality in processing earnings information?
Kwag, Seung Woog ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1557~1567
DOI : 10.7465/jkdi.2014.25.6.1557
In this paper, I propose a general hypothesis that, after the enactment of the Sarbanes-Oxley Act (SOA), financial statements convey more accurate and reliable corporate information to investors, who in turn reflect such improvements in stock prices. I test four practical hypotheses that simultaneously feature the degree of information asymmetry, forecast bias, and investor reaction to biased earnings information. The empirical results consistently suggest that post-SOA investors take advantage of the improvement in informational efficiency and accuracy and actively adjust for analyst bias in earnings forecasts. The SOA indeed appears to achieve its primary goal of investor protection.
Default Bayesian testing for the equality of shape parameters in the inverse Weibull distributions
Kang, Sang Gil ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1569~1579
DOI : 10.7465/jkdi.2014.25.6.1569
This article deals with the problem of testing for the equality of the shape parameters in two inverse Weibull distributions. We propose Bayesian hypothesis testing procedures for the equality of the shape parameters under the noninformative prior. The noninformative prior is usually improper, which yields a calibration problem: the Bayes factor is defined only up to a multiplicative constant. We therefore propose default Bayesian hypothesis testing procedures based on the fractional Bayes factor and the intrinsic Bayes factors under the reference priors. A simulation study and an example are provided.
Estimation of the half-logistic distribution based on multiply Type I hybrid censored sample
Shin, Hyejung ; Kim, Jungdae ; Lee, Changsoo ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1581~1589
DOI : 10.7465/jkdi.2014.25.6.1581
In this paper, we consider maximum likelihood estimators of the location and scale parameters for the half-logistic distribution when samples are multiply Type I hybrid censored. The scale parameter is estimated by approximate maximum likelihood estimation methods using two different Taylor series expansion types (
). We compare the estimators in terms of the root mean square error (RMSE). The simulation procedure is repeated 10,000 times for sample sizes n = 20 and 40 and various censoring schemes. The approximate MLE of the second type is better than that of the first type in terms of the RMSE. Further, an illustrative example with real data is presented.
Survival analysis of bank loan repayment rate for customers of Hawassa commercial bank of Ethiopia
Kitabo, Cheru Atsmegiorgis ; Kim, Jong Tae ;
Journal of the Korean Data and Information Science Society, volume 25, issue 6, 2014, Pages 1591~1598
DOI : 10.7465/jkdi.2014.25.6.1591
Reviews of the balance sheets of commercial banks show that loans constitute the largest portion of a bank's assets. Although the sector has the highest rate of profit, it also carries the greatest risk. The major goal of this study is to identify factors that can contribute to raising the loan repayment rate of customers of the Hawassa district commercial bank. A sample of 183 customers who took loans from October 2005 to April 2012 was taken from the bank's records. The Kaplan-Meier estimation method and the univariate Cox proportional hazards model were applied to identify factors affecting the bank loan repayment rate. The results from Kaplan-Meier survival estimation revealed that the loan repayment rate is significantly related to loan type, previous loan experience, educational level and mode of repayment. The log-rank test indicates that the survival probability of loan customers in repaying the loan does not differ statistically among groups classified by sex. Moreover, the univariate Cox proportional hazards model showed that educational level, previous loan experience, mode of repayment, collateral type and purpose of loan are significantly related to the loan repayment rate of the commercial bank's customers. Hence, banks should design loan strategies that give special emphasis to the significant factors when giving loans to their customers.
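The Kaplan-Meier method mentioned in this abstract can be sketched in a few lines. This is a generic product-limit estimator in Python, not the authors' analysis; the function name `kaplan_meier` and the toy durations are hypothetical, and an "event" here stands for whatever endpoint the study tracks, with 0 marking a censored observation.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: observed time for each subject
    events:    1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events observed at time t
        removed = 0    # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical toy data: five subjects, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored subjects contribute to the risk set until their censoring time but never lower the survival estimate themselves, which is the reason the estimator suits records in which many loans are still outstanding at the end of the observation window.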