REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
Korean Journal of Applied Statistics
The Korean Statistical Society
Comparisons of Kruglyak and Lander's Nonparametric Linkage Test and Weighted Regression Incorporating Replications
Choi, Eun-Kyeong ; Song, Hae-Hiang ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 1~17
DOI : 10.5351/KJAS.2008.21.1.001
The ordinary least squares regression method of Haseman and Elston (1972) is the most widely used method in genetic linkage studies for continuous traits of sib pairs. Kruglyak and Lander (1995) suggested a statistic that appears to be a nonparametric counterpart to the regression method of Haseman and Elston (1972), but in fact the two methods are quite different. In this paper the relationship between the two methods is described, and the methods are compared by simulation studies. One characteristic of the sib-pair linkage study is that the explanatory variable takes only three different values, so the dependent variable is heavily replicated at each value of the explanatory variable. We propose a weighted least squares regression method that is more appropriate to this situation, and we explore the efficiency of the weighted regression in genetic linkage studies with normal and non-normal simulated continuous trait data. The simulation studies demonstrate that the weighted regression is more powerful than the other tests.
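As a rough illustration of the weighted regression idea in this abstract, the sketch below regresses squared sib-pair trait differences on the estimated proportion of alleles shared identical by descent (0, 1/2 or 1). The function name, the toy data, and in particular the choice of weights (the inverse of the sample variance of the replicated responses at each value of the explanatory variable) are assumptions made for illustration; the abstract does not state which weights the authors use.

```python
from statistics import variance


def weighted_haseman_elston(pi_hat, sq_diff):
    """Weighted least squares fit of squared sib-pair differences on IBD sharing.

    pi_hat  : IBD proportion for each sib pair (typically 0, 0.5 or 1)
    sq_diff : squared trait difference for each sib pair
    Weights are 1 / (sample variance of sq_diff within each IBD group).
    This weighting is an assumption, not necessarily the authors' choice.
    Returns (intercept, slope); a negative slope is the usual sign of linkage.
    """
    # Group the replicated responses by the value of the explanatory variable.
    groups = {}
    for p, y in zip(pi_hat, sq_diff):
        groups.setdefault(p, []).append(y)

    weights = {}
    for p, ys in groups.items():
        if len(ys) < 2:
            raise ValueError("each IBD group needs at least two sib pairs")
        v = variance(ys)
        if v <= 0:
            raise ValueError("each IBD group needs a positive variance")
        weights[p] = 1.0 / v

    w = [weights[p] for p in pi_hat]
    sw = sum(w)
    x_bar = sum(wi * p for wi, p in zip(w, pi_hat)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, sq_diff)) / sw
    s_xy = sum(wi * (p - x_bar) * (y - y_bar)
               for wi, p, y in zip(w, pi_hat, sq_diff))
    s_xx = sum(wi * (p - x_bar) ** 2 for wi, p in zip(w, pi_hat))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data: group means 4, 3, 2 lie on the line 4 - 2 * pi.
pi_hat = [0, 0, 0.5, 0.5, 1, 1]
sq_diff = [2.0, 6.0, 2.0, 4.0, 1.5, 2.5]
intercept, slope = weighted_haseman_elston(pi_hat, sq_diff)
```

Because the three group means in the toy data fall exactly on a straight line, the fitted slope is -2 and the intercept is 4 whatever the weights; with real data the weights would change the estimates and their precision.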
Comparison of Principal Component Regression and Nonparametric Multivariate Trend Test for Multivariate Linkage
Kim, Su-Young ; Song, Hae-Hiang ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 19~33
DOI : 10.5351/KJAS.2008.21.1.019
The linear regression method proposed by Haseman and Elston (1972) for detecting linkage to a quantitative trait of sib pairs is a linkage testing method for a single locus and a single trait. However, multivariate methods for detecting linkage are needed when information on several traits affected by the same major gene is available for each individual. Amos et al. (1990) extended the regression method of Haseman and Elston (1972) to incorporate observations of two or more traits by estimating the principal component linear function that yields the strongest correlation between the squared pair differences in the trait measurements and identity by descent at a marker locus. At present, however, it is impossible to control the probability of type I errors with this method, since the exact distribution of the statistic is still unknown. In this paper, we propose a multivariate nonparametric trend test for detecting linkage to multiple traits. Using a simulation study with quantitative trait data, we compare the efficiency of the multivariate nonparametric trend test with that of the method developed by Amos et al. (1990). The simulation results reveal that, for the multivariate nonparametric trend test, the type I error rates are close to the predetermined significance levels and the power is generally high.
Methods of Combining P-values for Multiple Endpoints of Various Data Types
Kim, Su-Young ; Song, Hae-Hiang ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 35~51
DOI : 10.5351/KJAS.2008.21.1.035
Comparative studies in Phase III clinical trials quite often involve two or more equally important endpoints, and one cannot select a single primary endpoint from them. O'Brien (1984) proposed the OLS and GLS statistics as multivariate test statistics for continuous endpoints. Pocock et al. (1987) mentioned the possibility of analyzing a mixture of data types, such as quantitative, binary and survival data, with the OLS and GLS statistics, but the authors did not explore the problems of combining several endpoints of different types. Furthermore, they did not perform a simulation study to assess the efficiency of the OLS and GLS statistics for endpoints of mixed data types. In this paper, we propose methods of combining correlated P-values for the analysis of multiple endpoints, and compare by simulation the efficiency of these methods with that of the OLS and GLS statistics for a mixture of data types. Among the several methods of combining P-values, which are more advantageous than combining OLS and GLS statistics, method B maintains the nominal significance levels and is more efficient, while methods F and G have type I error rates larger than the specified significance levels, which might occasionally lead to a wrong conclusion.
Main SNP Identification of Hanwoo Carcass Weight with Multifactor Dimensionality Reduction(MDR) Method
Lee, Jea-Young ; Kim, Dong-Chul ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 53~63
DOI : 10.5351/KJAS.2008.21.1.053
It is commonly believed that human diseases and economic traits of livestock are caused not by a single gene acting alone, but by multiple genes interacting with one another. Detecting such effects is difficult because parametric statistical methods such as logistic regression are limited in detecting gene effects that depend solely on interactions with other genes and with environmental exposures. The multifactor dimensionality reduction (MDR) method, a nonparametric statistical method, is applied to improve the identification of single nucleotide polymorphisms (SNPs) associated with the carcass cold weight of Hanwoo (Korean cattle), and the results are compared with ANOVA results.
Statistical Analysis of a Small Scale Time-Course Microarray Experiment
Lee, Keun-Young ; Yang, Sang-Hwa ; Kim, Byung-Soo ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 65~80
DOI : 10.5351/KJAS.2008.21.1.065
Small scale time-course microarray experiments are those which have a small number of time points. They comprise about 80 percent of all time-course microarray experiments conducted up to 2005. Several statistical methods for small scale time-course microarray experiments have been proposed. In this paper we applied three methods, namely the QR method, the maSigPro method and STEM, to a real time-course microarray experiment which had six time points. We compared the performance of these three methods based on a simulation study and concluded that STEM generally outperformed the other two methods in terms of power when the FDR was set to 5%.
Transmission and Disequilibrium Tests Based on Sibship Data
Kim, Jin-Heum ; Jang, Yang-Soo ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 81~94
DOI : 10.5351/KJAS.2008.21.1.081
Family-based tests such as the transmission and disequilibrium tests (TDT) have proved to be powerful tools in the search for disease genes. Unlike case-control studies, these tests are not affected by population admixture, which can lead to spurious association of multiple highly linked markers with disease-susceptibility genes. These tests have largely required knowledge of parental marker genotypes. However, parental data are often not available for late-onset diseases. In this article we propose sib-TDTs that overcome this problem by using marker data from unaffected sibs instead of parents. To this end, we first define a Mantel-Haenszel-type statistic for each haplotype and then propose two tests based on this statistic. Simulation studies suggest that the proposed tests are robust to population admixture and that their power increases monotonically with the relative risk, irrespective of the mode of inheritance. We also illustrate the proposed tests with data from the Yonsei Cardiovascular Genome Center.
An Imputation for Nonresponses in the Survey on the Rural Living Indicators
Cho, Young-Sook ; Chun, Young-Min ; Hwang, Dae-Yong ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 95~107
DOI : 10.5351/KJAS.2008.21.1.095
The Survey on the Rural Living Indicators is a statistic approved by the National Statistical Office, and the survey is conducted by the Rural Resources Development Institute. This study used the raw data of the 2005 survey. After editing the raw data, we studied 1,582 households obtained by eliminating cases with nonresponses, and imputed nonresponses for 15 items selected from 146 items. The imputation methods and the measures of imputation efficiency used in the simulation differed by data type. For continuous data, we imputed the nonresponses with mean imputation, regression imputation and adjusted grey-based k-NN imputation (DU, DW, WU, WW), and compared the results by RMSE. For categorical data, we imputed the nonresponses with the mode method, probability imputation, the conditional mode method, the conditional probability method and hot-deck imputation, and compared the results by accuracy. According to the results, regression imputation and adjusted grey-based k-NN imputation were appropriate for continuous data, and hot-deck imputation was appropriate for categorical data.
Shrinkage Prediction for Small Area Estimations
Hwang, Hee-Jin ; Shin, Key-Il ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 109~123
DOI : 10.5351/KJAS.2008.21.1.109
Many small area estimation methods have been suggested, and model diagnostic checking techniques have been studied for comparing them. Almost all small area estimators were developed by minimizing the MSE (mean squared error), so the MSE is the well-known criterion for comparing estimators. In this paper we suggest a new small area estimator based on minimizing the MSPE (mean squared percentage error), which has recently regained attention. We also compare the newly suggested estimator with the estimators explained in Shin et al. (2007) using MSE, MSPE and other diagnostic checking criteria.
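To make the difference between the two criteria concrete, the sketch below finds the single constant that minimizes the empirical mean squared percentage error for a set of positive values. Setting the derivative of the MSPE to zero gives c = sum(1/y) / sum(1/y^2), which is never larger than the sample mean, the minimizer of the MSE. This is only a toy illustration of why an MSPE criterion shrinks the prediction; it is not the small area estimator proposed in the paper, and the function names are hypothetical.

```python
def mspe(values, c):
    """Empirical mean squared percentage error of the constant predictor c."""
    return sum(((y - c) / y) ** 2 for y in values) / len(values)


def mspe_optimal_constant(values):
    """Constant c minimizing the empirical MSPE for strictly positive values.

    Setting d/dc sum((1 - c / y)^2) = 0 gives c = sum(1/y) / sum(1/y^2).
    """
    if any(y <= 0 for y in values):
        raise ValueError("values must be strictly positive")
    s1 = sum(1.0 / y for y in values)
    s2 = sum(1.0 / y ** 2 for y in values)
    return s1 / s2


values = [1.0, 2.0, 4.0]
c_mspe = mspe_optimal_constant(values)   # MSPE-optimal constant
c_mse = sum(values) / len(values)        # MSE-optimal constant (sample mean)
```

For the toy values above, the MSPE-optimal constant is 4/3, noticeably smaller than the sample mean 7/3, because percentage errors penalize overprediction of small values most heavily.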
A Study of the Regional Economic Multiplier Impacts of Local Cultural Festival : In Case of Jeonju International Film Festival
Kim, Yon-Hyong ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 125~140
DOI : 10.5351/KJAS.2008.21.1.125
The purpose of this study is to analyze the economic impacts of a regional cultural festival using the regional input-output model. To this end, we calculate the output, value-added and employment multiplier impacts of the Jeonju International Film Festival (JIFF). The impacts of the JIFF on the regional economy are as follows: output of 11.2 billion won, value added of 5.3 billion won, and employment of 254 workers. New strategies are needed to obtain highly positive impacts from regional cultural festivals. In particular, networks should be built among sightseeing places, cultural heritage sites, restaurants, hotels and entertainment venues so that visitors and customers spend more.
Variable Selection for Multi-Purpose Multivariate Data Analysis
Huh, Myung-Hoe ; Lim, Yong-Bin ; Lee, Yong-Goo ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 141~149
DOI : 10.5351/KJAS.2008.21.1.141
Multivariate data with a quite large number of variables are now analyzed frequently. In such data sets, virtually duplicated variables may exist simultaneously even though they are conceptually distinguishable. Duplicated variables may cause problems such as the distortion of principal axes in principal component analysis and factor analysis, and the distortion of the distances between observations, i.e. the input for cluster analysis. In supervised learning or regression analysis, duplicated explanatory variables also often cause instability of the fitted models. Since real data analyses often serve multiple purposes, it is necessary to reduce the number of variables to a parsimonious level. The aim of this paper is to propose a practical algorithm for selecting a subset of variables from a given set of p input variables, by the criterion of minimum trace of the partial variances of the unselected variables unexplained by the selected variables. The usefulness of the proposed method is demonstrated in visualizing the relationship between selected and unselected variables, in building a predictive model with a very large number of independent variables, and in reducing the number of variables and purging/merging categories in categorical data.
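The selection criterion described in this abstract can be sketched with a simple greedy forward search over a covariance matrix. At each step the sketch picks the variable whose selection most reduces the trace of the partial (residual) covariance of the remaining variables, then sweeps that variable out. The greedy search order and the function name are assumptions for illustration; the abstract does not say how the authors search over subsets.

```python
def greedy_select(cov, k, tol=1e-12):
    """Greedy forward selection of k variables from a p x p covariance matrix.

    cov : covariance matrix as a list of lists
    Returns (selected indices, trace of the partial covariance of the
    unselected variables given the selected ones).
    """
    p = len(cov)
    resid = [row[:] for row in cov]   # residual covariance given the selection
    selected = []
    for _ in range(k):
        best_j, best_gain = None, -1.0
        for j in range(p):
            if j in selected or resid[j][j] <= tol:
                continue
            # Reduction in trace if variable j is partialled out.
            gain = sum(resid[i][j] ** 2 for i in range(p)) / resid[j][j]
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:            # nothing left to explain
            break
        col = [resid[i][best_j] for i in range(p)]
        d = col[best_j]
        # Sweep out the chosen variable: S <- S - s_j s_j' / s_jj.
        for a in range(p):
            for b in range(p):
                resid[a][b] -= col[a] * col[b] / d
        selected.append(best_j)
    remaining = sum(resid[i][i] for i in range(p))
    return selected, remaining


# Variables 0 and 1 are nearly duplicated; variable 2 is unrelated to both.
cov = [[1.2, 0.9, 0.0],
       [0.9, 1.0, 0.0],
       [0.0, 0.0, 0.5]]
selected, remaining = greedy_select(cov, 2)
```

In the toy matrix, the search keeps variable 0 and variable 2 and drops variable 1, whose variance is mostly explained by variable 0. A greedy search is not guaranteed to find the subset with the globally smallest trace.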
Mixture Model with Survey and a Statistical Model
Kim, Youn-Jong ; Kim, Yong-Chul ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 151~157
DOI : 10.5351/KJAS.2008.21.1.151
In general, surveys are used to forecast economic and non-economic demand in a market. However, it is difficult to estimate market demand objectively, because surveys affected by nonresponse and by an inadequate understanding of the questionnaire do not provide strong and uniform forecasts. Here we propose a method that combines a survey with a statistical model to estimate market demand, and we discuss the mixture model as applied to the service demand of an agency.
Estimating the Term Structure of Interest Rates Using Mixture of Weighted Least Squares Support Vector Machines
Nau, Sung-Kyun ; Shim, Joo-Yong ; Hwang, Chang-Ha ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 159~168
DOI : 10.5351/KJAS.2008.21.1.159
Since the term structure of interest rates (TSIR) consists of longitudinal data, both time to maturity and time should be considered simultaneously as input variables to obtain a more useful and more efficient function estimate. However, since the resulting data set becomes very large, a fast and reliable estimation method for large data sets is needed. Furthermore, the TSIR tends to be overestimated because the data are correlated. To solve these problems we propose a mixture of weighted least squares support vector machines. Applying the proposed method to US Treasury bond data, we find that the estimate is well smoothed and explains well the effects of the third stock market crash in the USA.
Three Dimensional CERES Plot in Generalized Linear Models
Kahng, Myung-Wook ; Kim, Bu-Yong ; Jeon, Jin-Young ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 169~176
DOI : 10.5351/KJAS.2008.21.1.169
We explore the structure and usefulness of the three-dimensional CERES plot as a basic tool for dealing with curvature as a function of the new predictors in generalized linear models. If predictors have nonlinear effects and there are nonlinear relationships among the predictors, the partial residual plot is not able to display the correct functional form of the predictors. Unlike this plot, the CERES plot can show the correct form. This is illustrated with simulated data.
Calculating Sample Variance for the Combined Data
Shin, Mi-Young ; Cho, Tae-Kyoung ;
Korean Journal of Applied Statistics, volume 21, issue 1, 2008, Pages 177~182
DOI : 10.5351/KJAS.2008.21.1.177
There are times when we need an additional sample to achieve a more accurate estimator. Since the two samples carry information about the same population, it is necessary to treat them as a single combined data set. In this paper we present the unpooled sample variance for the combined data when only the sample mean and variance of each data set are known, without the raw data. It is shown that the pooled variance is always greater than the exact variance. As the difference of means for the two data sets becomes larger, the difference of