Journal of the Korean Data and Information Science Society
Korean Data and Information Science Society
Ruin probabilities in a risk process perturbed by diffusion with two types of claims
Won, Ho Jeong ; Choi, Seung Kyoung ; Lee, Eui Yong ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 1~12
DOI : 10.7465/jkdi.2013.24.1.1
In this paper, we introduce a continuous-time risk model in which the surplus follows a diffusion process with positive drift and is subject to two types of claims. We assume that the sizes of both types of claims are exponentially distributed and that type I claims occur more frequently but are smaller than type II claims. We obtain the ruin probability, that is, the probability that the level of the surplus becomes negative, by establishing an integro-differential equation for it. We also obtain the ruin probabilities caused by each type of claim and the probability that the surplus becomes negative naturally due to the diffusion process. Finally, we present a numerical example comparing the impact of the two types of claims on the ruin probability with that of the diffusion process.
Self-diagnostic system for smartphone addiction using multiclass SVM
Pi, Su Young ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 13~22
DOI : 10.7465/jkdi.2013.24.1.13
Smartphone addiction has become more serious than internet addiction, since people can download and run numerous applications on smartphones even without an internet connection. However, smartphone addiction is not sufficiently addressed in current studies. The S-scale method developed by the Korea National Information Society Agency involves so many questions that respondents are likely to avoid the diagnosis altogether. Moreover, since the S-scale is determined by the total score of the answered items without taking demographic variables into account, it is difficult to obtain an accurate result. Therefore, in this paper, we extract from all the data the important factors that affect smartphone addiction, including demographic variables. We then classify the selected items with a neural network. A comparative analysis of the backpropagation learning algorithm and a multiclass support vector machine shows that the learning rate is slightly higher for the multiclass SVM. Since the multiclass SVM suggested in this paper is highly adaptable to rapid changes in data, we expect that it will lead to a more accurate self-diagnosis of smartphone addiction.
A study on comparing response times between Wibro and wired internet using portals
Ryu, Gui-Yeol ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 23~32
DOI : 10.7465/jkdi.2013.24.1.23
Regression diagnostics for response transformations in a partial linear model
Seo, Han Son ; Yoon, Min ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 33~39
DOI : 10.7465/jkdi.2013.24.1.33
When transforming the response variable in partial linear models, outliers can adversely affect the estimation of the transformation parameter, just as in linear models. Solving this problem requires both estimating the transformation parameter and detecting outliers, but these steps are difficult to carry out because of the arbitrariness of the nonparametric function included in the partial linear model. In this study, we suggest procedures for transforming the response variable that are robust to outliers in partial linear models, based on estimation of the nonparametric function and on outlier detection methods such as a sequential test and maximum trimmed likelihood estimation. The effectiveness of the proposed methods is verified and compared through a simulation study and examples.
Study on the K-scale reflecting the confidence of survey responses
Park, Hye Jung ; Pi, Su Young ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 41~51
DOI : 10.7465/jkdi.2013.24.1.41
In the information age, internet addiction has become a major issue in modern society, and its adverse effects have been increasing exponentially. With the spread of a great variety of internet-connected devices and services, such as high-speed wireless internet, netbooks, and smartphones, the K-scale diagnostic criteria have been used for internet addiction self-diagnosis tests. The criteria needed to change with the times, and they were revised in March 2012. In this paper, we analyze the actual state of internet addiction and the features of the K-scale at colleges in the Gyeongbuk area using the diagnostic criteria revised in 2012. Internet addiction is diagnosed from the respondents' subjective assessments, so respondents may deliberately give erroneous answers to hide the truth. In this paper, we attach confidence values to the survey responses to reduce response errors on the K-scale and enhance the reliability of the analysis.
A study on the density analysis of climatological stations using the correlation integral method in the fractal dimension
Kim, Hee-Kyung ; Lee, Yung-Seop ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 53~62
DOI : 10.7465/jkdi.2013.24.1.53
Currently, 11 climatological stations in Korea are registered with the World Meteorological Organization. Geographically, these stations are unevenly distributed across Korea and are located mainly along the coast. A density analysis of the stations should therefore be performed in order to produce high-quality climatological data. Using the correlation integral method, the density of climatological stations can be measured by estimating the fractal dimension. In this study, new climatological stations yielding a higher fractal dimension were selected. Sequential or simultaneous selection methods were applied until 3 new stations had been selected on the basis of the fractal dimension.
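The correlation integral mentioned above can be illustrated with a minimal sketch. It uses the standard definition (the fraction of point pairs closer than a radius r) and estimates the dimension as the slope of log C(r) against log r between two radii. The function names, the synthetic point sets, and the radii are assumptions made for illustration; the abstract does not give the authors' exact procedure or station coordinates.

```python
import math
from itertools import combinations


def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    pairs = list(combinations(points, 2))
    close = sum(1 for p, q in pairs if math.dist(p, q) < r)
    return close / len(pairs)


def correlation_dimension(points, r_small, r_large):
    """Two-radius slope of log C(r) versus log r (a crude dimension estimate)."""
    c_small = correlation_integral(points, r_small)
    c_large = correlation_integral(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )


# Hypothetical layouts: stations strung along a line versus spread over a grid.
line = [(float(i), 0.0) for i in range(100)]
grid = [(float(i), float(j)) for i in range(20) for j in range(20)]

line_dim = correlation_dimension(line, 5.5, 20.5)
grid_dim = correlation_dimension(grid, 2.5, 5.5)
print(f"line: {line_dim:.2f}, grid: {grid_dim:.2f}")
```

Points strung along a line give an estimate near 1, while points spread evenly over an area give an estimate closer to 2, which is why a higher fractal dimension can be read as a more even spatial coverage.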
Study on validity verification of Korean version of DELES and its relationship with perceived learning achievement and cyber education satisfaction
Kim, Jungjoo ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 63~72
DOI : 10.7465/jkdi.2013.24.1.63
This study aims to verify the validity of the Korean version of the DELES (distance education learning environment survey) and to analyze its relationship with learning achievement and distance education satisfaction. The target population is the students of K cyber university, and a total of 254 cases are used for the analysis. Exploratory and confirmatory factor analyses are applied to verify the 6 factors of the DELES, and structural equation analysis is applied to examine the relationships among the distance education learning environment, learning achievement, and distance education satisfaction. The results show that the DELES is composed of six factors (instructor support, student interaction & collaboration, personal relevance, authentic learning, active learning, and student autonomy) and that its model fit is appropriate. The structural equation analysis shows that the distance education learning environment significantly influences distance education satisfaction both directly and indirectly through learning achievement. Learning achievement also significantly influences distance education satisfaction. Conclusions and implications follow.
Class homogeneous tests with correlation
Hong, Chong Sun ; Lee, Na Young ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 73~83
DOI : 10.7465/jkdi.2013.24.1.73
Among the quantitative class tests for credit rating systems, the calibration tests examine the class homogeneous differences between observed and predicted probabilities. These include the binomial test and the chi-square test for a single time period, and the normal test and the extended traffic lights test for several time periods. In this work, we consider real data in which the variables are correlated, so that these test methods can be applied to credit rating systems as well as to various kinds of class data such as BWT data and FSI data.
Building credit scoring models with various types of target variables
Woo, Hyun Seok ; Lee, Seok Hyung ; Cho, HyungJun ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 85~94
DOI : 10.7465/jkdi.2013.24.1.85
As the financial market grows, losses increase due to failures of credit risk management caused by poor management of customer information or poor decision-making. Credit risk management therefore becomes more important, and it is essential to develop a credit scoring model, a fundamental tool for minimizing credit risk. Credit scoring models have so far been studied and developed only for binary target variables. In this paper, we consider other types of target variables, such as ordinal multinomial data and longitudinal binary data, and suggest credit scoring models for them. We then apply the developed models to real data and random data, and investigate their performance using the Kolmogorov-Smirnov statistic.
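As a rough illustration of how the Kolmogorov-Smirnov statistic is commonly used to judge a scoring model with a binary target, the sketch below computes the largest gap between the empirical score distributions of good and bad customers. The function name and the scores are hypothetical, and the extension to ordinal or longitudinal targets studied in the paper is not shown.

```python
def ks_statistic(scores_good, scores_bad):
    """Maximum gap between the empirical CDFs of the two score samples."""
    thresholds = sorted(set(scores_good) | set(scores_bad))
    best = 0.0
    for t in thresholds:
        cdf_good = sum(s <= t for s in scores_good) / len(scores_good)
        cdf_bad = sum(s <= t for s in scores_bad) / len(scores_bad)
        best = max(best, abs(cdf_good - cdf_bad))
    return best


# Hypothetical credit scores: higher is assumed to mean lower risk.
good = [620, 680, 700, 720, 750]
bad = [500, 540, 600, 640, 690]

ks = ks_statistic(good, bad)
print(f"KS = {ks:.2f}")
```

A value near 1 means the scores separate the two groups almost completely, and a value near 0 means the score distributions are nearly indistinguishable.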
Nonparametric procedures using aligned method and joint placement in randomized block design
Jo, Sungdong ; Kim, Dongjae ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 95~103
DOI : 10.7465/jkdi.2013.24.1.95
A nonparametric procedure for the randomized block design (RBD) was proposed by Friedman (1937) for general alternatives, and Page (1963) suggested a test for ordered alternatives in the RBD. In this paper, we propose a new nonparametric method for the randomized block design using the aligned method suggested by Hodges and Lehmann (1962) and the joint placement described in Chung and Kim (2007). A Monte Carlo simulation study is used to compare the power of the proposed procedure with that of the previous procedures.
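For orientation, the baseline Friedman statistic against which such procedures are compared can be sketched as follows. This is the textbook rank-sum form, assuming no ties within a block; it is not the aligned joint-placement method proposed in the paper, and the function name and data are made up for illustration.

```python
def friedman_statistic(blocks):
    """Friedman statistic for b blocks (rows) and k treatments (columns).

    Assumes no tied observations within a block.
    """
    b = len(blocks)
    k = len(blocks[0])
    rank_sums = [0] * k
    for row in blocks:
        # Rank the treatments within this block, 1 = smallest response.
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (b * k * (k + 1)) * sum_sq - 3.0 * b * (k + 1)


# Hypothetical responses: every block orders the three treatments identically.
consistent = [[1.2, 2.5, 3.1], [0.9, 1.8, 2.7], [1.5, 2.2, 3.9]]
# Hypothetical responses: the orderings cancel out across blocks.
balanced = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]

s_consistent = friedman_statistic(consistent)
s_balanced = friedman_statistic(balanced)
print(s_consistent, s_balanced)
```

When every block ranks the treatments the same way, the statistic reaches its maximum b(k - 1); when the rank sums are equal across treatments, it is zero.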
Characteristic analysis for moving in and moving out of departments - Focused on the D university example -
Choi, Seungbae ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 105~115
DOI : 10.7465/jkdi.2013.24.1.105
Universities in South Korea must adapt as the number of incoming students decreases because of the country's shrinking population. The Ministry of Education, Science and Technology is enforcing the restructuring of universities by evaluating all universities in Korea with indices such as the employment rate and the supplement rate of students. Most universities in Korea widely permit changes of major as one way to improve the 'supplement rate of students'. These changes of major (moving in and moving out) can make a university difficult to manage, because departments may be left with few students as students move out of low-level departments into high-level ones. Moreover, raising the rate of major changes causes no loss from the university's point of view, but it can put an individual department in a difficult situation. The purpose of this study is to identify the characteristics of changes of major through general statistical analysis and graphs produced by social network analysis, using the case of D university. The results are as follows: students tend to move (a) from engineering to the humanities and social sciences, (b) from departments with a low entrance level to those with a high one, and (c) from departments with a low employment rate to those with a high one.
Utilization of similarity measures by PIM with AMP as association rule thresholds
Park, Hee Chang ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 117~124
DOI : 10.7465/jkdi.2013.24.1.117
Association rule mining is a data mining technique that quantifies the relationship between sets of items in a huge database, and it has been applied in various fields such as internet shopping malls, healthcare, insurance, and education. There are three primary interestingness measures for association rules: support, confidence, and lift. Confidence is the most important of these, and association rules are generated using it. However, it is an asymmetric measure and takes only positive values, which can cause difficulties in generating association rules. In this paper, we apply similarity measures based on the probabilistic interestingness measure (PIM) with all marginal proportions (AMP) to solve this problem. Support, confidence, lift, the chi-square statistic, and several similarity measures by PIM with AMP are compared in a numerical example. As a result, we found that the similarity measures by PIM with AMP show the degree of association just as confidence does. In addition, because their values carry a sign, we could confirm the direction of association and select the best similarity measure by PIM with AMP.
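The three standard measures named above can be computed directly from transaction counts, as in the following minimal sketch. The function name and the toy baskets are assumptions for illustration; the PIM-with-AMP similarity measures proposed in the paper are not reproduced here.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a = set(antecedent)
    b = set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_b = sum(1 for t in transactions if b <= set(t))
    n_ab = sum(1 for t in transactions if (a | b) <= set(t))
    support = n_ab / n
    confidence = n_ab / n_a if n_a else 0.0
    lift = confidence / (n_b / n) if n_b else 0.0
    return support, confidence, lift


# Hypothetical shopping baskets.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"milk", "eggs"},
    {"eggs"},
]

forward = rule_measures(baskets, {"milk"}, {"bread"})
backward = rule_measures(baskets, {"bread"}, {"milk"})
print(forward)
print(backward)
```

In this toy data the confidence of milk -> bread differs from that of bread -> milk, which shows the asymmetry noted in the abstract, while support and lift are the same in both directions. All three measures are non-negative, so none of them carries a sign.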
Penalized logistic regression models for determining the discharge of dyspnea patients
Park, Cheolyong ; Kye, Myo Jin ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 125~133
DOI : 10.7465/jkdi.2013.24.1.125
In this paper, penalized binary logistic regression models are employed as statistical models for determining the discharge of 668 patients with a chief complaint of dyspnea, based on the results of 11 blood tests. Specifically, the ridge model based on the $L_2$ penalty and the Lasso model based on the $L_1$ penalty are considered. In the comparison of prediction accuracy, our models are compared with the logistic regression model with all 11 explanatory variables and the model with variables chosen by a variable selection method. The results show that, based on 10-fold cross-validation, the ridge logistic regression model has the best prediction accuracy among the 4 models.
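For reference, the two penalized criteria are usually written as below. This is the textbook form, not notation taken from the paper: the symbols $\beta_0$, $\beta$, $\lambda$, $x_i$, and $y_i$ are assumptions, with $y_i \in \{0, 1\}$ the discharge indicator of patient $i$, $x_i$ the vector of 11 blood test results, and $\lambda \ge 0$ a tuning parameter that would typically be chosen by cross-validation. Scaling conventions (for example, dividing the log-likelihood by $n$) vary between references.

```latex
\begin{aligned}
\hat{\beta}^{\text{ridge}} &= \arg\min_{\beta_0,\,\beta}
  \Big\{ -\ell(\beta_0,\beta) + \lambda \sum_{j=1}^{11} \beta_j^{2} \Big\}, \\
\hat{\beta}^{\text{lasso}} &= \arg\min_{\beta_0,\,\beta}
  \Big\{ -\ell(\beta_0,\beta) + \lambda \sum_{j=1}^{11} |\beta_j| \Big\}, \\
\ell(\beta_0,\beta) &= \sum_{i=1}^{n}
  \Big[ y_i \big(\beta_0 + x_i^{\top}\beta\big)
        - \log\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big].
\end{aligned}
```

The squared penalty shrinks all coefficients toward zero without removing any, whereas the absolute-value penalty can set some coefficients exactly to zero and so performs variable selection.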
Using rough set to develop a volatility reverting strategy in options market
Kang, Young Joong ; Oh, Kyong Joo ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 135~150
DOI : 10.7465/jkdi.2013.24.1.135
This study proposes a novel option strategy using the characteristic of volatility reversion and a rough set algorithm in the options market. Until now, various studies have been conducted on stock and futures markets, but little research has been done on the options market. In particular, research on option trading strategies using high-frequency data is limited. This study has two purposes. The first is to earn a profit using a volatility reversion model when a volatility gap occurs. The second is to pursue a more stable profit by filtering out inaccurate entry points through the rough set algorithm. Since the options market is affected by various elements such as underlying assets, volatility, and interest rates, the point of this study is to hedge all elements except volatility and to profit from the volatility gap.
Major gene interactions effect identification on the quality of Hanwoo by radial graph
Lee, Jea-Young ; Bae, Jae-Young ; Lee, Jin-Mok ; Oh, Dong-Yep ; Lee, Seong-Won ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 151~159
DOI : 10.7465/jkdi.2013.24.1.151
It is well known that human diseases and the economic traits of livestock are affected much more by gene combination effects than by single gene effects. However, existing methods have disadvantages such as heavy computation, high cost, and long running time. To overcome these drawbacks, SNPHarvester was developed to find the main gene combinations among many genes. In this paper, we used SNPHarvester to find, among sets of SNPs, the superior gene combinations related to the quality of Korean beef cattle, and used a radial graph to identify, among the selected SNP combinations, the superior genotypes that can enhance various qualities of Korean beef.
The study of foreign exchange trading revenue model using decision tree and gradient boosting
Jung, Ji Hyeon ; Min, Dae Kee ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 161~170
DOI : 10.7465/jkdi.2013.24.1.161
FX (foreign exchange) is a form of exchange for the global decentralized trading of international currencies. In simple terms, forex is the simultaneous purchase and sale of currencies, or the exchange of one country's currency for another's. We can find consistent trading rules by comparing the gradient boosting method and the decision tree method. Methods such as time series analysis, which are used for predicting financial markets, have the advantage of yielding long-term forecasting models, but they have difficulty reflecting rapidly changing short-term price fluctuations. Therefore, in this study, the gradient boosting method and the decision tree method are applied to short-term data in order to derive rules for the revenue structure of the FX market, and the stability and predictive performance of the models are evaluated.
Noninformative priors for the shape parameter in the generalized Pareto distribution
Kang, Sang Gil ; Kim, Dal Ho ; Lee, Woo Dong ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 171~178
DOI : 10.7465/jkdi.2013.24.1.171
In this paper, we develop noninformative priors for the generalized Pareto distribution when the parameter of interest is the shape parameter. We consider the first order and the second order matching priors, and reveal that the second order matching prior does not exist. It turns out that the reference prior satisfies a first order matching criterion, but Jeffreys' prior is not a first order matching prior. A simulation study is performed and a real example is given.
Accuracy of linear approximation for fitted values in nonlinear regression
Kahng, Myung-Wook ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 179~187
DOI : 10.7465/jkdi.2013.24.1.179
Bates and Watts (1981) have discussed the problems of reparameterizing nonlinear models in obtaining accurate linear approximation confidence regions for the parameters. A similar problem exists with computing confidence curves for fitted values or predictions. The statistical behavior of fitted values does not depend on the parameterization. Thus, as long as the intrinsic curvature is small, standard Wald intervals for fitted values are likely to be sufficient. Accuracy of linear approximation for fitted values is investigated using confidence curves.
Conditional bootstrap confidence intervals for classification error rate when a block of observations is missing
Chung, Hie-Choon ; Han, Chien-Pai ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 189~200
DOI : 10.7465/jkdi.2013.24.1.189
In this paper, we assume that there are two distinct multivariate normal populations with equal covariance matrices. We also assume that the two populations are equally likely and that the costs of misclassification are equal. The classification rule depends on whether or not the training samples include missing values. We consider conditional bootstrap confidence intervals for the classification error rate when a block of observations is missing.
Variable selection in censored kernel regression
Choi, Kook-Lyeol ; Shim, Jooyong ;
Journal of the Korean Data and Information Science Society, volume 24, issue 1, 2013, Pages 201~209
DOI : 10.7465/jkdi.2013.24.1.201
For censored regression, it is often the case that some input variables are not important, while some input variables are more important than others. We propose a novel algorithm for selecting such important input variables for censored kernel regression. It is based on penalized regression with a weighted quadratic loss function for the censored data, where the weight is computed from the empirical survival function of the censoring variable. We employ a weighted version of ANOVA decomposition kernels to choose the optimal subset of important input variables. Experimental results are then presented that indicate the performance of the proposed variable selection method.