Journal of the Korean Data and Information Science Society
Korean Data and Information Science Society
Nonparametric estimation of conditional quantile with censored data
Kim, Eun-Young ; Choi, Hyemi ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 211~222
DOI : 10.7465/jkdi.2013.24.2.211
We consider the problem of nonparametrically estimating the conditional quantile function from censored data and propose new estimators. They are based on the local logistic regression technique of Lee et al. (2006) and the "double-kernel" technique of Yu and Jones (1998), respectively, each modified to handle random censoring. We compare them with two existing estimators based on local linear fits using the check function approach. The comparison is carried out through a simulation study.
An intelligent early warning system for forecasting abnormal investment trends of foreign investors
Oh, Kyong Joo ; Kim, Young Min ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 223~233
DOI : 10.7465/jkdi.2013.24.2.223
In local emerging stock markets such as Korea, Hong Kong, Singapore and Taiwan, foreign investors (FI) are recognized as an important investment community owing to the globalization and deregulation of financial markets. Local governments and private and institutional investors therefore need to monitor the behavior of FI for sudden, massive selling of stocks. The main aim of this study is to propose an early warning system (EWS) that issues a warning signal ahead of possible massive stock selling by FI in the market. To this end, we suggest a machine learning algorithm that predicts the behavior of FI by forecasting future conditions. The study is carried out empirically on the Korean stock market.
Using correlated volume index to support investment strategies in Kospi200 future market
Cho, Seong-Hyun ; Oh, Kyong Joo ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 235~244
DOI : 10.7465/jkdi.2013.24.2.235
In this study, we propose a new trading strategy using a trading volume index in the KOSPI200 futures market. Many studies have examined the relationship between volume and price, but none has reached a clear conclusion. This study analyzes the economic usefulness of an investment strategy based on a volume index. The analysis shows that trading volume is a leading index. The paper has two objectives. The first is to construct an index using the Correlated Volume Index (CVI), and the second is to find appropriate timing to buy or sell the KOSPI200 futures index. The results demonstrate the importance of the proposed model in the KOSPI200 futures market, and the model should help investors make sound investment decisions.
A study on proposing a method for grouping R, F, and M in RFM model
Ryu, Gui-Yeol ; Moon, Young-Soo ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 245~255
DOI : 10.7465/jkdi.2013.24.2.245
The objective of this study is to propose a method for grouping R, F, and M in the RFM model. Our model uses six levels based on the standard normal distribution: the first level is the upper 2.5%, the second the next 13.5%, the third the next 34%, the fourth the next 34%, the fifth the next 13.5%, and the sixth the final 2.5%. The values are symmetric and the limits are clear. We compare the proposed model with the traditional 5-level and 10-level models using NDSL data from KISTI. The proposed model divides the distribution of the RFM function most clearly for all choices of weights, because it uses the distribution of customers. Comparisons of our model with grouping by cluster analysis, and studies on the weights of the RFM model, are still needed.
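The six percentages in this abstract (2.5, 13.5, 34, 34, 13.5, 2.5) are the rounded 68-95 rule of the standard normal distribution, so one way to picture the grouping is to standardize each R, F, or M value and cut at z = 2, 1, 0, -1, -2. The sketch below illustrates only that reading; the function names `level_from_z` and `six_level_group` are hypothetical, and the paper's exact cut points and standardization may differ.

```python
from statistics import mean, pstdev

# Assumed cut points (not taken from the paper): z = 2, 1, 0, -1, -2.
Z_CUTS = (2.0, 1.0, 0.0, -1.0, -2.0)

def level_from_z(z):
    """Map a z-score to a level 1..6, where 1 is roughly the upper 2.5%."""
    for level, cut in enumerate(Z_CUTS, start=1):
        if z >= cut:
            return level
    return 6  # below z = -2: roughly the lowest 2.5%

def six_level_group(values):
    """Standardize raw R, F, or M values and assign each a level 1..6."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; cannot standardize")
    return [level_from_z((v - mu) / sigma) for v in values]
```

For example, `six_level_group(purchase_counts)` would return one level per customer for a hypothetical list `purchase_counts` of F values.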
Evaluation on validity of health literacy measurement scale
Choi, Kyounh-Ho ; Lee, Jeong-Ok ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 257~265
DOI : 10.7465/jkdi.2013.24.2.257
As the evaluation of health literacy becomes more important, various evaluation measures are being developed. Nevertheless, discussion of developing proper measures in Korean remains inactive. In this paper, we therefore propose a Korean REALM (rapid estimate of adult literacy in medicine) measure based on a five-point scale and investigate its validity. We find that the five-point Korean REALM measure has high reliability and that factor analysis yields a single dimension. Positive responses were lower than with the two-point scale, and the correlation coefficient with the NVS (newest vital sign) was statistically significant. We therefore conclude that the five-point Korean REALM measure is a valid measurement. Furthermore, there were statistically significant differences in health literacy between general students and nursing students.
Non-linear regression model considering all association thresholds for decision of association rule numbers
Park, Hee Chang ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 267~275
DOI : 10.7465/jkdi.2013.24.2.267
Among data mining techniques, the association rule is the most recently developed, and it finds the relevance between two items in a large database. It is applied directly in the field because it clearly quantifies the relationship between two or more items. To determine whether an association rule is meaningful, we use interestingness measures such as support, confidence, and lift. These measures are meaningful in that they provide statistical or logical grounds for pruning uninteresting rules. However, the thresholds for these measures are chosen by experience, and the number of useful rules is hard to estimate. If too many rules are generated, we cannot effectively extract the useful ones. In this paper, we design a variety of non-linear regression equations, considering all association thresholds, that relate the number of rules to the three interestingness measures. We then diagnose multicollinearity and autocorrelation problems and use analysis of variance results and adjusted coefficients of determination to select the best model through numerical experiments.
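For readers unfamiliar with the three interestingness measures named in this abstract, the following minimal sketch computes them from their standard textbook definitions for a rule A -> B. The function name `rule_measures` and the toy baskets are hypothetical and are not taken from the paper.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    baskets = [set(t) for t in transactions]
    a, b = set(antecedent), set(consequent)
    n = len(baskets)
    n_a = sum(1 for t in baskets if a <= t)          # baskets containing A
    n_b = sum(1 for t in baskets if b <= t)          # baskets containing B
    n_ab = sum(1 for t in baskets if (a | b) <= t)   # baskets containing A and B
    if n == 0 or n_a == 0 or n_b == 0:
        raise ValueError("measures are undefined for empty data or unseen items")
    support = n_ab / n               # P(A and B)
    confidence = n_ab / n_a          # P(B | A)
    lift = confidence / (n_b / n)    # P(B | A) / P(B)
    return support, confidence, lift

# Toy data for illustration only.
toy_baskets = [{"milk", "bread"}, {"milk"}, {"bread"}, {"milk", "bread"}]
support, confidence, lift = rule_measures(toy_baskets, {"milk"}, {"bread"})
```

Raising any of the three thresholds prunes more rules, which is why the number of surviving rules can be modeled as a function of the thresholds.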
Power and major gene-gene identification of dummy multifactor dimensionality reduction algorithm
Yeo, Jungsou ; La, Boomi ; Lee, Ho-Guen ; Lee, Seong-Won ; Lee, Jea-Young ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 277~287
DOI : 10.7465/jkdi.2013.24.2.277
It is important to detect gene-gene interactions in GWAS (genome-wide association studies). There have been many studies on detecting gene-gene interaction. One such method is D-MDR (dummy multifactor dimensionality reduction). The goal of this study is to evaluate, by simulation, the power of D-MDR for identifying gene-gene interactions. We also apply the method to real data to identify interaction effects of single nucleotide polymorphisms (SNPs) responsible for economic traits in a Korean cattle population.
Politics behavior data analysis using the adaptive Neyman test
Kim, Myo Jeong ; Hahn, Kyu S. ; Lim, Johan ; Lee, Kyeong Eun ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 289~301
DOI : 10.7465/jkdi.2013.24.2.289
We analyze respondents' reactions to an Obama advertisement titled 'Fix the Economy'. The respondents are divided into three groups: Democrats, Republicans, and independents. The skin complexion of the Obama photo was manipulated so that participants were exposed to either a dark or a light version of the photograph. To obtain decorrelated stationary data, we apply the discrete Fourier transform to each curve and then apply the adaptive Neyman test of Fan (1998) to the transformed data. A significant difference is found only in the independent group.
A statistical analysis of the fat mass repeated measures data using mixed model
Jo, Jinnam ; Chang, Un Jae ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 303~310
DOI : 10.7465/jkdi.2013.24.2.303
Forty-two female students whose fat mass ratio was over 30% participated in an 8-week fat mass loss experiment with two treatments. They kept a diary of the foods they ate every day, took pictures of the foods, and transmitted the pictures to the experimenter by camera phone. Among them, 28 students took the pictures with a regular camera phone (Treatment A), and the other students used a smart phone (Treatment B). Fat mass weight and related variables were measured four times at two-week intervals over the 8 weeks. A mixed model analysis of the repeated measures data selected the AR(1) covariance matrix as the optimal covariance pattern. The correlation between two successive time points is high, at 0.838. Under the AR(1) covariance structure, the students using smart phones were somewhat more effective in losing fat mass weight than those using regular camera phones. The time effect was highly significant, but the treatment-time interaction effect was insignificant. The baseline effect and total cholesterol were found to be significant, calorie intake from food was somewhat significant, and the waist-to-hip ratio was insignificant.
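Under an AR(1) structure, the correlation between measurements k time steps apart is the lag-one correlation raised to the power k. The sketch below builds the implied 4 x 4 correlation matrix for the four measurement occasions using the reported value 0.838; the function name `ar1_correlation` is hypothetical, and the variance component, which the abstract does not report, is left out.

```python
def ar1_correlation(rho, n_times):
    """AR(1) correlation matrix: entry (i, j) equals rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]

# Lag-one correlation reported in the abstract; four measurement occasions.
R = ar1_correlation(0.838, 4)
```

With this value, measurements two occasions apart have an implied correlation of about 0.70, and the first and last occasions about 0.59.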
The model of the weighted proportion estimation for forecasting the number of population
Yoon, Yong Hwa ; Kim, Jong Tae ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 311~320
DOI : 10.7465/jkdi.2013.24.2.311
The purpose of this paper is to suggest methods for forecasting the number of students. Generalized weighted proportion estimation models are proposed and used to forecast the number of students until 2029. The results of a Monte Carlo simulation show that the suggested method is powerful for forecasting. In conclusion, from 2019 the number of third-grade high school students will be smaller than the college admission quota.
Time series regression model for forecasting the number of elementary school teachers
Ryu, Soo Rack ; Kim, Jong Tae ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 321~332
DOI : 10.7465/jkdi.2013.24.2.321
Because of continued low birthrates, the number of elementary school students in 2020 will be 17% lower than in 2011. The purpose of this study is to forecast the number of elementary school teachers until 2020. We used data from the education statistical yearbooks from 1970 to 2010. We used a time series regression model, a time series grouped regression model, and an exponential smoothing model to predict the number of teachers for the next ten years. The time series grouped regression model turned out to be better than the other models for forecasting the number of elementary school teachers.
Estimable functions of less than full rank linear model
Choi, Jaesung ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 333~339
DOI : 10.7465/jkdi.2013.24.2.333
This paper discusses a method for obtaining a basis set of estimable functions of a less than full rank linear model. Since the model parameters are not estimable, estimable functions must be identified in order to make proper inferences about them. The paper suggests using a full rank factorization of the model matrix to find estimable functions easily. Although they can be obtained from the model matrix in many different ways, the suggested full rank factorization technique is one of the easier methods. The paper also discusses how to use a projection matrix to identify estimable functions.
A study on semi-supervised kernel ridge regression estimation
Seok, Kyungha ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 341~353
DOI : 10.7465/jkdi.2013.24.2.341
In many practical machine learning and data mining applications, unlabeled data are inexpensive and easy to obtain. Semi-supervised learning tries to use such data to improve prediction performance. In this paper, a semi-supervised regression method, semi-supervised kernel ridge regression estimation, is proposed on the basis of the kernel ridge regression model. The proposed method does not require a pilot estimation of the labels of the unlabeled data. As a result, it has advantages including fewer parameters, easy computation, and good generalization ability. Experiments show that the proposed method can effectively utilize unlabeled data to improve regression estimation.
Importance of sport emotional intelligence on sports psychological skills and sports emotion among athletes
Lee, Mi Sook ; Park, Cheolyong ; Nam, Jung Hoon ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 355~368
DOI : 10.7465/jkdi.2013.24.2.355
The purpose of this study was to verify the relationships among sport emotional intelligence, sport psychological skills, and sport emotion in university athletes. To this end, the construct validity and reliability of the measured data were verified using SPSS 18.0 and AMOS 18.0. In addition, differences in sport psychological skills and sport emotion according to the level of sport emotional intelligence were analyzed by latent means analysis with AMOS 18.0, and the relationships among the related factors were analyzed by covariance structure analysis. The results were as follows. First, the group with high sport emotional intelligence scored higher than the low group on team harmony, mental state, and willpower among the sport psychological skills, and on pride and happiness among the sport emotions. Second, sport emotional intelligence had a positive effect on sport psychological skills. Third, sport emotional intelligence had a positive effect on sport emotion. Fourth, sport psychological skills had a positive effect on sport emotion.
An exploration of tour skill factors influential to game results of LPGA players
Son, Seung Bum ; Lee, Chang Jin ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 369~377
DOI : 10.7465/jkdi.2013.24.2.369
The purpose of this study was to explore which of the tour skill factors provided by the LPGA most influenced players' tour results. The study used the statistics of the top 10 LPGA players over 9 years (2004-2012). Ten variables were used (average score, top 10 finishes, average putts, average birdies, average eagles, driving distance, driving accuracy, greens in regulation (GIR), sand saves, and putts per GIR), together with prize money earnings. Stepwise multiple regression was conducted using SPSS Win 20.0. The results indicated that, over the 9 seasons, the most influential tour skill factor was average score, the second was average putts, the third was driving distance, and the fourth was top 10 finishes. On a year-by-year basis, the influential factors were average score in 2004, GIR in 2005, average eagles in 2006, top 10 finishes in 2007, average score in 2008, average putts in 2009, average score, GIR, and putts per GIR in 2010, average birdies in 2011, and putts per GIR in 2012.
Study of university students' perceptions on participation in elections via structural equation model - Focusing on K university students
Choi, Hyun Seok ; Kwon, Yunji ; Ha, Jeongcheol ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 379~390
DOI : 10.7465/jkdi.2013.24.2.379
Through a survey of K university students' perceptions of participation in elections, we seek ways to encourage sound political participation and to boost voter turnout effectively. We analyze the relations among image of elections, interest in elections, and willingness to participate via a structural equation model. We found that both image of elections and interest in elections significantly influence willingness to participate in elections. Using participation in the last election as a moderating variable, we found that image of elections has a stronger effect on willingness to participate among those who participated, whereas this was not the case for interest in elections.
GACV for partially linear support vector regression
Shim, Jooyong ; Seok, Kyungha ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 391~399
DOI : 10.7465/jkdi.2013.24.2.391
Partially linear regression is capable of providing a more complete description of the linear and nonlinear relationships among random variables. In support vector regression (SVR), the hyper-parameters are known to affect the performance of the regression. In this paper, we propose an iteratively reweighted least squares (IRWLS) procedure to solve the quadratic problem of partially linear support vector regression with a modified loss function, which enables us to use the generalized approximate cross validation (GACV) function to select the hyper-parameters. Experimental results are then presented to illustrate the performance of the partially linear SVR using the IRWLS procedure.
Soil moisture prediction using a support vector regression
Lee, Danhyang ; Kim, Gwangseob ; Lee, Kyeong Eun ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 401~408
DOI : 10.7465/jkdi.2013.24.2.401
Soil moisture is a very important variable in various areas of hydrological processes. We predict soil moisture using support vector regression. The model is trained and tested using soil moisture data observed at five sites in the Yongdam dam basin. For the four sites used to train the model (Jucheon, Bugui, Sangieon and Ahncheon), the correlation coefficient between the estimates and the observed values is about 0.976. When the model is applied to Cheoncheon2 for validation, the correlation coefficient between the estimates and the observed values of soil moisture is about 0.835. We compare these results with those of artificial neural network models.
Erratum to "Major gene interactions effect identification on the quality of Hanwoo by radial graph"
Lee, Jea-Young ; Bae, Jae-Young ; Lee, Jin-Mok ; Oh, Dong-Yep ; Lee, Seong-Won ;
Journal of the Korean Data and Information Science Society, volume 24, issue 2, 2013, Pages 409~409
DOI : 10.7465/jkdi.2013.24.2.409