REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
Journal of the Korean Data and Information Science Society
Journal Basic Information
Journal DOI :
Korean Data and Information Science Society
Editor in Chief :
Volume & Issues
Volume 20, Issue 6 - Nov 2009
Volume 20, Issue 5 - Sep 2009
Volume 20, Issue 4 - Jul 2009
Volume 20, Issue 3 - May 2009
Volume 20, Issue 2 - Mar 2009
Volume 20, Issue 1 - Jan 2009
Long-term trend of particulate matter in Seoul
Park, Hye-Ryun ; Choi, Ki-Heon ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 765~777
Our study illustrates the long-term trend in PM10 (particulate matter of 10 micrometers or less) after removing confounding effects. Daily PM10 concentrations were measured at 27 sites, and meteorological data (maximum temperature, humidity, maximum wind speed, and solar radiation) were obtained from the National Institute of Environmental Research for the period from January 1996 to December 2000. To estimate the long-term trend in the observed data, we set up a model including regression spline smooth functions of time and of the meteorological factors, capturing the seasonal time trend and any possible nonlinear relationships. The estimated trend decreases slightly after adjusting for meteorological factors and the seasonal time trend.
Statistical randomness test for Korean lotto game
Lim, Su-Yeol ; Baek, Jang-Sun ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 779~786
Lotto is one of the most popular lottery games in the world. In the Korean lotto, 6 numbers are drawn randomly, without replacement, from the numbers 1, 2, ..., 45. The profits from the lotto support social welfare. However, there has been a suspicion that the choice of the winning numbers might not be random. In this study, we applied the randomness test developed by Coronel-Brizio et al. (2008) to the historical Korean lotto data to see whether the drawing process is random. The results show that the process was random during the two periods under the management of different operating companies and under different ticket prices, respectively.
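The Coronel-Brizio et al. test applied in the paper is specific; as a hedged stand-in, the simplest notion of draw randomness, uniform frequency of the 45 numbers, can be checked with a basic chi-square statistic. The sketch below uses simulated draws, not the historical Korean data:

```python
import random
from collections import Counter

def chi_square_uniformity(draws, n_numbers=45):
    """Chi-square goodness-of-fit statistic for uniformity of drawn numbers.

    draws: list of 6-number draws without replacement from 1..n_numbers.
    Under randomness, every number has the same expected frequency.
    """
    counts = Counter(n for draw in draws for n in draw)
    total = sum(counts.values())
    expected = total / n_numbers
    return sum((counts.get(k, 0) - expected) ** 2 / expected
               for k in range(1, n_numbers + 1))

random.seed(1)
draws = [random.sample(range(1, 46), 6) for _ in range(400)]
stat = chi_square_uniformity(draws)
# Compare stat against a chi-square distribution with 44 degrees of freedom;
# values far in the upper tail would cast doubt on randomness.
```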
A case study on the random coefficient model for diet experimental data
Jo, Jin-Nam ; Baik, Jai-Wook ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 787~796
A random coefficient model is applied when the times of repeated measurements are not fixed across subjects. The inference procedures for a random coefficient model are the same as those for a mixed model. Diet experimental data were used to illustrate the random coefficient model. Various random coefficient models were investigated for the experimental data and compared with each other, and an optimal random coefficient model was selected. The analysis showed that, among the fixed effects, the baseline, treatment, height, and time effects were highly significant. The treatment combining diet foods and exercise was more effective in losing weight than diet foods alone. The fixed cubic time effect was highly significant. The variance components corresponding to the random subject effect and the random linear, quadratic, and cubic time effects were all positive. When a quartic time effect was added as a random coefficient, the model did not converge; thus the model with random coefficients up to the cubic term was taken as optimal.
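Under standard mixed-model assumptions, the cubic random coefficient model described above can be sketched as follows (the notation is ours, not the paper's):

```latex
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}
       + b_{0i} + b_{1i} t_{ij} + b_{2i} t_{ij}^{2} + b_{3i} t_{ij}^{3}
       + \varepsilon_{ij},
\qquad
\mathbf{b}_i \sim N(\mathbf{0}, \mathbf{D}),
\quad
\varepsilon_{ij} \sim N(0, \sigma^{2}),
```

where \(y_{ij}\) is the response of subject \(i\) at time \(t_{ij}\), \(\mathbf{x}_{ij}\) holds the fixed effects (baseline, treatment, height, and polynomial time terms), and the diagonal of \(\mathbf{D}\) contains the variance components reported as positive for the random intercept and the linear, quadratic, and cubic time slopes.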
An empirical study on the relationship between speed change and injuries sustained in rear-end collisions
Kang, Sung-Mo ; Kim, Joo-Hwan ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 797~807
In an automobile rear-end collision, the scale of the collision, namely the extent of vehicle damage and the injury to the passengers, is affected by the change in speed. Based on photographic interpretation of actual accident cases in the Seoul and Incheon areas, this study measured the depth of crush and calculated the speed change from the accident statements and speeds; injury data such as diagnoses and days of hospitalization were also collected. The claimed period of hospitalization and diagnosis proved to have no statistical correlation with the depth of vehicle crush or the speed change. Based on the statistical analysis in this study and previous foreign studies, we found that 78.1% of the personal-injury claims did not reach the injury threshold. In the future there should be objective information on the scale of an accident when accepting injury claims, and application of the suggested injury threshold level is considered to be very useful.
A study on neighbor selection methods in k-NN collaborative filtering recommender system
Lee, Seok-Jun ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 809~818
The collaborative filtering approach predicts an active user's preference for specific items traded in e-commerce by using other users' preference information. To improve prediction accuracy, enough user preference information must be gathered; however, both too much and too little preference information can harm prediction accuracy. This research suggests a method for deciding a suitable number of neighbor users when applying a collaborative filtering algorithm, improving on existing k-nearest-neighbor selection methods. The results provide useful methods for improving prediction accuracy and also refine an exploratory data analysis approach for deciding the appropriate number of nearest neighbors.
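To make the neighbor-selection question concrete, here is a minimal memory-based collaborative filtering sketch (our illustration on toy data, not the paper's algorithm): similarity by Pearson correlation over co-rated items, prediction from the k most similar neighbors. Varying k exhibits exactly the trade-off the abstract describes:

```python
import math

# Toy user-item ratings (hypothetical data for illustration).
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 4, "d": 5},
    "u3": {"a": 1, "b": 5, "c": 1, "d": 1},
}

def pearson(u, v):
    """Pearson correlation between users u and v over co-rated items."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mu = sum(ratings[u][i] for i in common) / len(common)
    mv = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = math.sqrt(sum((ratings[u][i] - mu) ** 2 for i in common) *
                    sum((ratings[v][i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def predict(active, item, k=2):
    """Predict the active user's rating of item from the k most similar neighbors."""
    neighbors = sorted(((pearson(active, v), v) for v in ratings
                        if v != active and item in ratings[v]), reverse=True)[:k]
    den = sum(abs(w) for w, _ in neighbors)
    return sum(w * ratings[v][item] for w, v in neighbors) / den if den else None
```

For example, `predict("u1", "d", k=2)` blends the positively correlated neighbor u2 and the negatively correlated neighbor u3; with k=1 only u2 would contribute, illustrating how the chosen neighborhood size changes the prediction.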
Categorical data analysis of sensory evaluation data with Hanwoo bull beef
Lee, Hye-Jung ; Cho, Soo-Hyun ; Kim, Jae-Hee ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 819~827
This study investigates the relationship between sociodemographic factors and Korean consumers' palatability evaluation grades using Hanwoo sensory evaluation data. Binary and multinomial logistic regression models are fitted with independent variables such as the consumer's living location, age, gender, occupation, and monthly income, as well as the beef cut, with the palatability grade as the dependent variable. A stepwise variable selection procedure is used to find the final model, and odds ratios are calculated to assess the associations between categories.
Optimization procedure for parameter design using neural network
Na, Myung-Whan ; Kwon, Yong-Man ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 829~835
Parameter design is an approach to reducing the performance variation of quality characteristic values in products and processes. In Taguchi parameter design, the signal-to-noise ratio is used to find the set of operating conditions under which variability around the target is low. However, there are difficulties in practical application, such as complex and nonlinear relationships between quality characteristics and control (design) factors, and interactions among control factors. Neural networks have a learning capability and are model-free; these characteristics make them a competitive tool for multivariable input-output modeling. In this paper we propose a substantially simpler optimization procedure for parameter design using a neural network. An example is given to compare the Taguchi method with the neural network method.
A study on points per game using goals scored per game and goals conceded per game in European professional football leagues
Shin, Sang-Keun ; Cho, Yong-Ju ; Cho, Young-Seuk ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 837~844
This study used data from 5,170 soccer matches played between 1950 and 2008 in five European professional football leagues. We compared the average SGPG (goals scored per game) under the two-point and the three-point win systems, and compared the average SGPG across leagues. To predict PtsG (points per game), we performed a regression analysis using SGPG and LGPG (goals conceded per game). Finally, we applied the regression analysis to the K-League.
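The paper regresses PtsG on SGPG and LGPG jointly; as a simplified, self-contained sketch we collapse the two predictors into per-game goal difference (SGPG minus LGPG) and fit ordinary least squares on made-up team summaries:

```python
def ols(x, y):
    """Ordinary least squares fit y ~ a + b*x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) /
         sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

# Hypothetical per-team season summaries:
# goal difference per game (SGPG - LGPG) vs. points per game.
goal_diff = [-0.8, -0.3, 0.0, 0.4, 0.9]
ptsg = [0.9, 1.2, 1.4, 1.7, 2.1]
intercept, slope = ols(goal_diff, ptsg)
# A positive slope reflects the obvious direction: teams that outscore
# opponents accumulate more points per game.
```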
Rasch analysis to the Copenhagen neck functional disability scale with neck pain subjects
Kim, Tae-Ho ; Gong, Won-Tae ; Park, So-Yeon ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 845~855
The purpose of this study was to examine the category function, item structure, and model-data fit of the Copenhagen neck functional disability scale (CNFDS) for subjects with neck pain, using a Rasch rating scale analysis. The data were obtained from assessments of 71 college students with neck pain. The 'concentration' item showed misfit, and the remaining fourteen items were found to fit the self-reporting of disability due to neck pain. Among these 14 items, the most difficult was 'help' and the easiest was 'social contact'. The separation reliability was 0.85 for subjects and 0.97 for items. The CNFDS has thus been shown to be a valid and reliable self-report measure of disability due to mild neck pain. This study suggests that individuals with mild neck pain may use a modified CNFDS that excludes the 'concentration' item and collapses the responses to two levels.
Methods of forecasting the number of students based on the promotion proportion
Kim, Jong-Tae ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 857~867
The purpose of this paper is to suggest methods for forecasting the number of elementary, middle, and high-school students up to the year 2026, based on the proportion of promotion. The suggested methods are the proportion of promotion, the moving average, the Holt-Winters model, SARIMA, and a regression fit. The results show that the forecasting performance of the moving average method is better than that of the other methods.
Statistical algorithm and application for the noise variance estimation
Kim, Yeong-Hwa ; Nam, Ji-Ho ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 869~878
Image restoration techniques such as noise reduction and contrast enhancement have been studied for enhancing images contaminated by noise. An image degraded by additive random noise can be enhanced by noise reduction, and sigma filtering is one of the most widely used methods for reducing such noise. In this paper, we propose a new sigma filter algorithm based on noise variance estimation which effectively enhances an image degraded by noise. Specifically, the Bartlett test is used to measure the degree of noise relative to the degree of image feature. Simulation results are given to show the performance of the proposed algorithm.
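The paper's contribution is the variance estimation step (via the Bartlett test); what follows is only the classical sigma filter building block it refines, sketched on a toy grayscale patch. Each pixel is replaced by the mean of neighborhood pixels within two sigma of its own value:

```python
def sigma_filter(img, sigma, radius=1):
    """Classical sigma filter: each pixel becomes the mean of the neighborhood
    pixels whose values lie within 2*sigma of the center pixel's value."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            c = img[y][x]
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))
                    if abs(img[j][i] - c) <= 2 * sigma]
            out[y][x] = sum(vals) / len(vals)
    return out

patch = [[10, 10, 10],
         [10, 50, 10],
         [10, 10, 10]]
smoothed = sigma_filter(patch, sigma=5)
```

Note that the isolated outlier at the center survives filtering unchanged, a known weakness of the plain sigma filter and one reason a better noise variance estimate matters.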
Analysis of market share attraction data using LS-SVM
Park, Hye-Jung ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 879~886
The purpose of this article is to present an application of the least squares support vector machine (LS-SVM) to analyzing the existing structure of brands. We estimate the parameters of the market share attraction model using LS-SVM, a non-parametric function estimation technique that performs nonlinear regression by constructing a linear regression function in a high-dimensional feature space. These properties make the LS-SVM technique a good candidate for solving the market share attraction model. To illustrate the performance of the proposed method, we use sales data from South Korea's car market.
Minimum risk point estimation of two-stage procedure for mean
Choi, Ki-Heon ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 887~894
The two-stage minimum risk point estimation of the mean, the probability of success in a sequence of Bernoulli trials, is considered for the case where the loss is taken to be the symmetrized relative squared error of estimation plus a fixed cost per observation. First-order asymptotic expansions are obtained for the large-sample properties of the two-stage procedure. A Monte Carlo simulation is carried out to obtain the expected sample size that minimizes the risk and to examine its finite-sample behavior.
A correction of SE from penalized partial likelihood in frailty models
Ha, Il-Do ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 895~903
The penalized partial likelihood based on the restricted maximum likelihood method has been widely used for inference in frailty models. However, the standard error estimate for the frailty parameter estimator can be biased downward. In this paper we show that such underestimation can be corrected by using hierarchical likelihood. In particular, the hierarchical likelihood gives a statistically efficient procedure for various random-effect models, including frailty models. The proposed method is illustrated via a numerical example and a simulation study. The simulation results demonstrate that the corrected standard error estimate substantially reduces the bias.
A kernel machine for estimation of mean and volatility functions
Shim, Joo-Yong ; Park, Hye-Jung ; Hwang, Chang-Ha ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 905~912
We propose a doubly penalized kernel machine (DPKM) which uses a heteroscedastic location-scale model as its basic model and estimates both the mean and volatility functions simultaneously by kernel machines. We also present a model selection method employing generalized approximate cross-validation techniques for choosing the hyperparameters that affect the performance of the DPKM. Artificial examples are provided to show the usefulness of the DPKM for estimating the mean and volatility functions.
Effects of curvature on leverage in nonlinear regression
Kahng, Myung-Wook ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 913~917
The measures of leverage in linear regression have been extended to nonlinear regression models. We consider several curvature measures of nonlinearity in an estimation situation. The relationship between measures of leverage and statistical curvature is explored in nonlinear regression models. The circumstances under which the Jacobian leverage reduces to a tangent-plane leverage are discussed in connection with the effective residual curvature of the nonlinear model.
Correspondence analysis for studying association between geography and cancer
Song, Joon-Jin ; Yu, Pingjian ; Ren, Yuan ; Chung, Ming-Hua ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 919~924
Geographical location carries information such as demography, local economy, environment, and lifestyle, which could be sources of cancer occurrence. Analyzing geographical locations associated with cancer occurrence can be instructive to physicians, patients, and health administrators regarding resource allocation, expenditures, prophylaxis, and treatment. In this paper, we explored the correspondence relationship between geographical locations and cancer mortality rates using correspondence analysis, and illustrated the approach with the mortality rates of the top 10 cancers in the 75 counties of Arkansas from 2001 to 2005. Geographical variations in cancer mortality rates are evaluated across Arkansas counties. Based on the contingency table, a correspondence analysis model is developed, and simple indices indicating the degree to which the regions and the cancers affect each other are calculated. Quantitative results are visualized and mapped in two-dimensional graphs.
Variance function estimation with LS-SVM for replicated data
Shim, Joo-Yong ; Park, Hye-Jung ; Seok, Kyung-Ha ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 925~931
In this paper we propose a variance function estimation method for replicated data based on averages of squared residuals obtained from the mean function estimated by the least squares support vector machine. The Newton-Raphson method is used to obtain the associated parameter vector for the variance function estimation. Furthermore, cross-validation functions are introduced to select the hyperparameters that affect the performance of the proposed estimation method. Experimental results are then presented to illustrate the performance of the proposed procedure.
Obtaining bootstrap data for the joint distribution of bivariate survival times
Kwon, Se-Hyug ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 933~939
Bivariate data in clinical research often involve two types of failure times: a mark variable for the first failure time, and the final failure time. This paper shows how to generate bootstrap data for Bayesian estimation of the joint distribution of bivariate survival times. The observed data were generated from Frank's family, and the fake data were simulated with a Gamma prior on the survival time. The bootstrap data were obtained by combining the observed data with the fake data simulated from them.
Estimation of the Block and Basu model for system level life testing with censored data
Jeong, In-Ho ; Cho, Kil-Ho ; Cho, Jang-Sik ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 941~948
We consider a life testing experiment in which several two-component shared parallel systems are put on test, and the test is terminated at a specified number of system failures. The bivariate data obtained from such system-level life testing can be classified into three cases: (1) both components failed with known failure times; (2) one component censored and the other failed, with its failure time either known or unknown; (3) both components censored. In this paper, the maximum likelihood estimators of the parameters of the Block and Basu bivariate exponential distribution under the above censoring scheme are obtained, and the results of comparative studies are presented.
Kernel method for autoregressive data
Shim, Joo-Yong ; Lee, Jang-Taek ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 949~954
In this paper the autoregressive process is applied to kernel regression in order to infer nonlinear models for predicting responses. We propose a kernel method for autoregressive data which estimates the mean function by kernel machines. We also present a model selection method employing cross-validation techniques for choosing the hyperparameters that affect the performance of kernel regression. Artificial and real examples are provided to show the usefulness of the proposed method for estimating the mean function in the presence of autocorrelation in the data.
Reliability and ratio in exponentiated complementary power function distribution
Moon, Yeung-Gil ; Lee, Chang-Soo ; Ryu, Se-Gi ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 955~960
We define an exponentiated complementary power function distribution and consider its moments, hazard rate, and parameter inference. We also consider inference for the reliability and for the distributions of the quotient and the ratio of two independent exponentiated complementary power function random variables.
Estimation for the half triangle distribution based on Type-I hybrid censored samples
Kang, Suk-Bok ; Cho, Young-Seuk ; Han, Jun-Tae ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 961~969
Hybrid censoring is a mixture of the Type-I and Type-II censoring schemes. This paper deals with estimation based on Type-I hybrid censored samples from the half triangle distribution. We derive several estimators of the scale parameter of the half triangle distribution based on Type-I hybrid censored samples, and compare the proposed estimators in terms of mean squared error for various censored samples.
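The Type-I hybrid censoring scheme itself is simple to state: with r and a time limit T fixed in advance, observation stops at min(x_(r), T), where x_(r) is the r-th ordered failure time. A sketch with hypothetical lifetimes (the estimators derived in the paper are not reproduced here):

```python
def type1_hybrid_censor(lifetimes, r, T):
    """Type-I hybrid censoring: stop the test at min(x_(r), T), where x_(r)
    is the r-th ordered failure time and T is a prefixed time limit.
    Returns the observed (uncensored) failure times and the stopping time."""
    x = sorted(lifetimes)
    stop = min(x[r - 1], T)
    return [t for t in x if t <= stop], stop

# Hypothetical lifetimes: the 3rd ordered failure (0.8) occurs before T = 1.0,
# so the test stops there and the last two units are censored.
observed, stop = type1_hybrid_censor([0.5, 1.2, 0.3, 2.0, 0.8], r=3, T=1.0)
```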
Fuzzy histogram in estimating loss distributions for operational risk [Author's Correction]
Park, R.J. ;
Journal of the Korean Data and Information Science Society, volume 20, issue 5, 2009, Pages 971~971