REFERENCE LINKING PLATFORM OF KOREA S&T JOURNALS
Communications for Statistical Applications and Methods
Journal Basic Information
The Korean Statistical Society
Volume & Issues
Volume 12, Issue 3 - Dec 2005
Volume 12, Issue 2 - Aug 2005
Volume 12, Issue 1 - Apr 2005
Multivariate EWMA Control Charts for Monitoring Dispersion Matrix
Chang Duk-Joon ; Lee Jae Man ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 265~273
DOI : 10.5351/CKSS.2005.12.2.265
In this paper, we propose multivariate EWMA control charts based on both the combine-accumulate and accumulate-combine approaches to monitor the dispersion matrix of multiple quality variables. The numerical performance of the proposed charts is evaluated in terms of average run length (ARL). The results show that small smoothing constants with the accumulate-combine approach are preferred for detecting small shifts in the production process.
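The roles of the smoothing constant and the run length can be illustrated with a minimal univariate sketch. This is not the multivariate dispersion chart of the paper: the function name `first_signal`, the control limit, and the data are hypothetical, and only the generic EWMA recursion z_t = lam * x_t + (1 - lam) * z_{t-1} is assumed.

```python
def first_signal(xs, lam=0.1, target=0.0, limit=1.0):
    """Return the 1-based index of the first out-of-control signal, or None.

    Generic univariate EWMA recursion (illustration only):
        z_t = lam * x_t + (1 - lam) * z_{t-1},  z_0 = target.
    A signal is raised when |z_t - target| exceeds the (hypothetical) limit.
    """
    z = target
    for t, x in enumerate(xs, start=1):
        z = lam * x + (1 - lam) * z
        if abs(z - target) > limit:
            return t
    return None


# The process shifts from 0 to 5 at the fourth observation.
print(first_signal([0, 0, 0, 5, 5, 5], lam=0.5, target=0.0, limit=3.0))
```

The run length is the index returned by `first_signal`; averaging it over many simulated series would give an estimate of the ARL for a chosen `lam` and `limit`.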
Convergence in Probability for Weighted Sums of Fuzzy Random Variables
Joo, Sang-Yeol ; Hyun, Young-Nam ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 275~283
DOI : 10.5351/CKSS.2005.12.2.275
In this paper, we give a sufficient condition for convergence in probability of weighted sums of convex-compactly uniformly integrable fuzzy random variables. As a result, we obtain a weak law of large numbers for weighted sums of convexly tight fuzzy random variables.
Comparison of Parameter Estimation Methods in A Kappa Distribution
Park Jeong-Soo ; Hwang Young-A ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 285~294
DOI : 10.5351/CKSS.2005.12.2.285
This paper compares parameter estimation methods for the 3-parameter Kappa distribution, which is sometimes used in flood frequency analysis. Method of moments estimation (MME), L-moment estimation (L-ME), and maximum likelihood estimation (MLE) are applied to estimate the three parameters. The performance of these methods is compared by Monte Carlo simulation. In particular, for computing MME and L-ME, the three-dimensional nonlinear equations are reduced to a one-dimensional equation, which is solved by Newton-Raphson iteration under constraints. Based on the mean squared error criterion, L-ME (or MME) is recommended for small sample sizes (n ≤ 100), while MLE is good for large sample sizes.
Statistical Correction of Numerical Model Forecasts for Typhoon Tracks
Sohn, Keon-Tae ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 295~304
DOI : 10.5351/CKSS.2005.12.2.295
This paper concentrates on the prediction of typhoon tracks using the dynamic linear model (DLM) for the statistical correction of the numerical model guidance used at the JMA. The DLM with the proposed forecast strategy is applied to reduce the systematic errors of the guidance using the latest observation. All parameters of the DLM are updated dynamically, and backward forecasting is performed to remove the effect of initial values.
Estimating the Number of Clusters using Hotelling's T²
Choi, Kyung-Mee ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 305~312
DOI : 10.5351/CKSS.2005.12.2.305
In cluster analysis, Hotelling's T² can be used to estimate the unknown number of clusters based on the idea of a multiple comparison procedure. In particular, its threshold is obtained according to the probability of committing a type I error. Examples are used to compare Hotelling's T² with other classical location test statistics such as the sum-of-squared error and Wilks' Λ. Hierarchical clustering is used to reveal the underlying structure of the data. Related criteria are also reviewed in view of both the between variance and the within variance.
Collapsibility and Suppression for Cumulative Logistic Model
Hong, Chong-Sun ; Kim, Kil-Tae ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 313~322
DOI : 10.5351/CKSS.2005.12.2.313
In this paper, we discuss suppression for the logistic regression model. Suppression for the linear regression model was defined as a relationship among the sums of squares for regression as well as the correlation coefficients of the variables. Since it is not common to obtain a simple correlation coefficient for the binary response variable of a logistic model, we consider cumulative logistic models with multinomial and ordinal response variables rather than the usual logistic model. As the categories of the response variable in the cumulative logistic model are collapsed into binary, the suppressions for these logistic models are found to change. These suppression results for cumulative logistic models are discussed and compared with those of the linear model.
Bayesian Estimation for Skew Normal Distributions Using Data Augmentation
Kim Hea-Jung ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 323~333
DOI : 10.5351/CKSS.2005.12.2.323
In this paper, we develop an MCMC method for estimating skew normal distributions. The method, which uses the data augmentation technique, gives a simple way of inferring the distribution where fully parametric frequentist approaches are not available for small to moderate sample sizes. The necessary theory for the method and its computation is provided. Two numerical examples are given to demonstrate the performance of the method.
Consensus Clustering for Time Course Gene Expression Microarray Data
Kim, Seo-Young ; Bae, Jong-Sung ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 335~348
DOI : 10.5351/CKSS.2005.12.2.335
The rapid development of microarray technologies has enabled the monitoring of expression levels of thousands of genes simultaneously. Recently, time course gene expression data have often been measured to study dynamic biological systems and gene regulatory networks. For such data, biologists attempt to group genes based on the temporal pattern of their expression levels. We apply the consensus clustering algorithm to time course gene expression data in order to infer statistically meaningful information from the measurements. We evaluate consensus clustering and existing clustering methods with various validation measures. In this paper, we consider hierarchical clustering and Diana as existing methods, and consensus clustering with hierarchical clustering, Diana, and mixed hierarchical and Diana methods, and we evaluate their performance on a real microarray data set and two simulated data sets.
Change-point Estimation with Loess of Means
Kim, Jae-Hee ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 349~357
DOI : 10.5351/CKSS.2005.12.2.349
We suggest a functional technique with loess smoothing for estimating the change-point when there is one change-point in the mean model. The proposed change-point estimator is consistent. A simulation study shows that the proposed change-point estimator performs well in comparison with other parametric or nonparametric change-point estimators.
Model Checking for Time-Series Count Data
Lee, Sung-Im ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 359~364
DOI : 10.5351/CKSS.2005.12.2.359
This paper considers a specification test of the conditional Poisson regression model for time series count data. Although conditional models for count data have received attention and have been proposed in several forms, few studies have focused on checking their adequacy. Motivated by the test of the martingale difference assumption, a specification test via the Ljung-Box statistic is proposed for the conditional model of time series count data. Simulation results are provided to illustrate the performance of the Ljung-Box test.
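A rough sketch of how a Ljung-Box statistic could be computed on residuals from a fitted conditional Poisson model is given below. The abstract does not say which residuals the paper uses; Pearson residuals and mean-centred autocorrelations are assumptions here, and the function names are hypothetical. The statistic itself is the standard Q = n(n + 2) Σ r_k² / (n - k), summed over lags k = 1, ..., m.

```python
import math


def pearson_residuals(counts, fitted_means):
    """Pearson residuals (y_t - mu_t) / sqrt(mu_t) for a Poisson model.

    The use of Pearson residuals is an assumption made for illustration.
    """
    return [(y - mu) / math.sqrt(mu) for y, mu in zip(counts, fitted_means)]


def ljung_box(resid, max_lag):
    """Standard Ljung-Box statistic Q = n (n + 2) * sum_k r_k^2 / (n - k)."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [r - mean for r in resid]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k.
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q


# Strongly alternating residuals give a large lag-1 statistic.
print(ljung_box([1, -1, 1, -1], max_lag=1))
```

Under the usual asymptotics, Q is compared with a chi-square quantile with `max_lag` degrees of freedom; whether that reference distribution applies unchanged in the conditional Poisson setting is a question the paper addresses, not this sketch.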
Variance Estimation for Imputed Survey Data using Balanced Repeated Replication Method
Lee, Jun-Suk ; Hong, Tae-Kyong ; Namkung, Pyong ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 365~379
DOI : 10.5351/CKSS.2005.12.2.365
Balanced Repeated Replication (BRR) is widely used to estimate the variance of linear or nonlinear estimators from complex sample surveys. Most survey data sets include imputed missing values and treat the imputed values as observed data. However, applying the standard BRR variance estimation formula to imputed data does not produce valid variance estimators. Shao, Chen and Chen (1998) proposed an adjusted BRR method that adjusts the imputed data to produce more accurate variance estimators. In this paper, another adjusted BRR method is proposed, with examples using real data.
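For orientation, the standard (unadjusted) BRR estimator for an estimated total can be sketched as follows, assuming a design with two sampled PSUs per stratum and half-samples chosen from the rows of a Hadamard matrix. This shows only the textbook formula that the abstract calls the standard BRR; it does not implement the adjustment for imputed data proposed in the paper or in Shao, Chen and Chen (1998). The data layout and the name `brr_variance` are hypothetical.

```python
# 4 x 4 Hadamard matrix; columns 1..3 define balanced half-samples
# for up to three strata (column 0 is all ones and is skipped).
HADAMARD_4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def brr_variance(strata):
    """Standard BRR variance of an estimated total.

    strata: list of (y1, y2) weighted PSU totals, two PSUs per stratum,
    at most three strata for this 4 x 4 matrix.
    """
    if len(strata) > 3:
        raise ValueError("this sketch supports at most 3 strata")
    full = sum(y1 + y2 for y1, y2 in strata)
    replicates = []
    for row in HADAMARD_4:
        est = 0
        for h, (y1, y2) in enumerate(strata):
            # +1 keeps the first PSU, -1 keeps the second; its weight is doubled.
            est += 2 * (y1 if row[h + 1] == 1 else y2)
        replicates.append(est)
    return sum((r - full) ** 2 for r in replicates) / len(replicates)


print(brr_variance([(10, 12), (20, 26), (5, 4)]))
```

For a linear estimator and a fully balanced set of half-samples, this reduces to the sum over strata of (y1 - y2)², which gives a quick check on the output.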
Use of Random Coefficient Model for Fruit Bearing Prediction in Crop Insurance
Park Heungsun ; Jun Yong-Bum ; Gil Young-Soo ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 381~394
DOI : 10.5351/CKSS.2005.12.2.381
In order to estimate the damage to orchards due to natural disasters such as typhoons, severe rain, freezing, or frost, it is necessary to estimate the number of fruits borne before and after the damage. Estimating the fruit bearing after the damage is easily done by delegations, but it costs too much to survey every insured farm household and calculate the fruit bearing before the damage. In this article, we suggest using a random coefficient model to predict the number of fruits borne in the orchards before the damage, based on tree age and area information.
A Study on Noninformative Priors of Intraclass Correlation Coefficients in Familial Data
Jin, Bong-Soo ; Kim, Byung-Hwee ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 395~411
DOI : 10.5351/CKSS.2005.12.2.395
In this paper, we develop the Jeffreys' prior, the reference prior, and the probability matching priors for the difference of intraclass correlation coefficients in familial data. We prove a sufficient condition for propriety of the posterior distributions. Using the marginal posterior distributions under those noninformative priors, we compare posterior quantiles and frequentist coverage probabilities.
Robust Cross Validation Score
Park, Dong-Ryeon ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 413~423
DOI : 10.5351/CKSS.2005.12.2.413
Consider the problem of estimating the underlying regression function from a set of noisy data contaminated by a long-tailed error distribution. Several robust smoothing techniques exist, and they have turned out to be very useful for reducing the influence of outlying observations. However, whatever robust smoother we use, we still have to choose the smoothing parameter, and relatively little attention has been paid to robust bandwidth selection. In this paper, we adopt the idea of robust location parameter estimation and propose robust cross validation score functions.
A Note on Statistical Reports on the Korean Anthropometric Survey
Park Jinwoo ; Lee Eun-kyung ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 425~433
DOI : 10.5351/CKSS.2005.12.2.425
Most nationwide surveys are summarized by statistical tables and graphs. In spite of the high cost of obtaining statistical results from surveys, we often find statistical problems in the resulting reports. In this paper, we point out some statistical problems in the Korean Anthropometric Survey report. We also suggest some alternatives that may avoid the illustrated problems.
A General Solution of the Integral Equation for Erlang Distribution
Lee Yoon Dong ; Choi Hyemi ; Lee Eun-kyung ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 435~442
DOI : 10.5351/CKSS.2005.12.2.435
The mathematical properties of sequentially operated systems are often described by integral equations. The reservoir system of a product and the sequential probability ratio test (SPRT) are typical examples of sequentially operated systems. For the case where the underlying random quantities follow an Erlang distribution, a systematic method was developed to solve the integral equations. We extend the method to cases with accrual functions of more general types. The solutions of the integral equations are represented as linear combinations of distribution functions, and the coefficients of the linear combinations are obtained by solving a linear system derived from the continuity condition of the solutions.
Wakeby Distribution and the Maximum Likelihood Estimation Algorithm in Which Probability Density Function Is Not Explicitly Expressed
Park Jeong-Soo ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 443~451
DOI : 10.5351/CKSS.2005.12.2.443
This paper studies a new algorithm for finding the maximum likelihood estimate (MLE) when the probability density function is not explicitly expressed. Newton-Raphson root finding and a constrained nonlinear numerical optimization algorithm (so-called feasible sequential quadratic programming) are used. The algorithm is applied to the Wakeby distribution, which plays an important role in hydrology and water resource research for the analysis of extreme rainfall. The performance of maximum likelihood estimates and L-moment estimates (L-ME) is compared by Monte Carlo simulation. L-ME is recommended for up to 300 observations and MLE for larger sample sizes. Methods for speeding up the algorithm and for computing variances of the estimates are discussed.
Influence of an Observation on the t-statistic
Kim, Hong-Gie ; Kim, Kyung-Hee ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 453~462
DOI : 10.5351/CKSS.2005.12.2.453
We derive the influence function of the t statistic and describe its main feature: the influence function of the t statistic takes two forms depending on the value of
. Sample influence functions, computed on random samples from the normal distribution, are used to verify the validity of the derived influence function. The simulation study shows that the derived influence function is very accurate in estimating changes in the t statistic when an observation is added or deleted.
A Sequence of Improvement over the Lindley Type Estimator with the Cases of Unknown Covariance Matrices
Kim, Byung-Hwee ; Baek, Hoh-Yoo ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 463~472
DOI : 10.5351/CKSS.2005.12.2.463
In this paper, the problem of estimating a p-variate (p ≥ 4) normal mean vector is considered in a decision-theoretic setup. Using a simple property of the noncentral chi-square distribution, a sequence of estimators dominating the Lindley type estimator in the cases of unknown covariance matrices is produced, and each improved estimator is better than the previous one.
Likelihood Ratio Criterion for Testing Sphericity from a Multivariate Normal Sample with 2-step Monotone Missing Data Pattern
Choi, Byung-Jin ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 473~481
DOI : 10.5351/CKSS.2005.12.2.473
The problem of testing the sphericity structure of the covariance matrix of a multivariate normal distribution is introduced for a sample with a 2-step monotone missing data pattern. The maximum likelihood method is described for estimating the parameters on the basis of the sample. Using these estimates, the likelihood ratio criterion for testing sphericity is derived.
Bayesian Changepoints Detection for the Power Law Process with Binary Segmentation Procedures
Kim Hyunsoo ; Kim Seong W. ; Jang Hakjin ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 483~496
DOI : 10.5351/CKSS.2005.12.2.483
We consider the power law process which is assumed to have multiple changepoints. We propose a binary segmentation procedure for locating all existing changepoints. We select one model between the no-changepoints model and the single changepoint model by the Bayes factor. We repeat this procedure until no more changepoints are found. Then we carry out a multiple test based on the Bayes factor through the intrinsic priors of Berger and Pericchi (1996) to investigate the system behaviour of failure times. We demonstrate our procedure with a real dataset and some simulated datasets.
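The recursive structure of binary segmentation described above can be sketched generically. The sketch below is not the paper's procedure: instead of a Bayes factor with intrinsic priors for a power law process, it uses a simple stand-in criterion (the reduction in within-segment squared error for a shift in mean), and the threshold, function names, and data are hypothetical.

```python
def sse(xs):
    """Sum of squared deviations from the segment mean (0.0 if empty)."""
    if not xs:
        return 0.0
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def best_split(xs):
    """Return (gain, index) of the best single split, or (0.0, None).

    Stand-in criterion: reduction in squared error, NOT a Bayes factor.
    Each side of a split must contain at least two observations.
    """
    total = sse(xs)
    best = (0.0, None)
    for i in range(2, len(xs) - 1):
        gain = total - sse(xs[:i]) - sse(xs[i:])
        if gain > best[0]:
            best = (gain, i)
    return best


def binary_segmentation(xs, threshold, offset=0):
    """Compare 'no changepoint' with 'one changepoint', then recurse."""
    gain, i = best_split(xs)
    if i is None or gain < threshold:
        return []  # the no-changepoint model is retained for this segment
    left = binary_segmentation(xs[:i], threshold, offset)
    right = binary_segmentation(xs[i:], threshold, offset + i)
    return left + [offset + i] + right


data = [0, 0, 0, 0, 10, 10, 10, 10, 3, 3, 3, 3]
print(binary_segmentation(data, threshold=20.0))
```

Each returned index is the position of the first observation of a new segment. Replacing `best_split` and the threshold rule with a Bayes factor comparison would give the model selection step that the abstract describes.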
A Study on K-Means Clustering
Bae, Wha-Soo ; Roh, Se-Won ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 497~508
DOI : 10.5351/CKSS.2005.12.2.497
This paper studies K-means clustering, focusing on the initialization, which affects the clustering results in K-means cluster analysis. Four different methods (the MA method, the KA method, the Max-Min method, and the Space Partition method) were compared. The clustering results show some differences among these methods: the MA method sometimes leads to incorrect clustering because of inappropriate initialization, depending on the type of data, while the Max-Min method is shown to be more effective than the other methods, especially when the data size is large.
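To make the idea of a max-min style initialization concrete, here is a minimal sketch of farthest-first seeding: each new seed is the point whose distance to its nearest existing seed is largest. Whether this matches the exact Max-Min method compared in the paper is an assumption; the choice of the first seed and the function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def maximin_init(points, k):
    """Choose k initial centers by farthest-first (max-min) traversal.

    The first center is simply the first point (an arbitrary choice
    made for this sketch).
    """
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point that maximizes the distance to its nearest center.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(farthest)
    return centers


pts = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 8)]
print(maximin_init(pts, 3))
```

Because the seeds are spread out by construction, two seeds are unlikely to fall in the same cluster, which is one intuition for why this kind of initialization can behave well on large data sets.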
A Study on a Statistical Matching Method Using Clustering for Data Enrichment
Kim Soon Y. ; Lee Ki H. ; Chung Sung S. ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 509~520
DOI : 10.5351/CKSS.2005.12.2.509
Data fusion is defined as the process of combining data and information from different sources so that useful information content can be used more effectively. In this paper, we propose a data fusion algorithm using the k-means clustering method for data enrichment, to improve data quality in the knowledge discovery in databases (KDD) process. An empirical study comparing the proposed data fusion technique with existing techniques shows that the newly proposed clustering data fusion technique has low MSE for continuous fusion variables.
Bootstrap Method for Row and Column Effects Model
Jeong, Hyeong-Chul ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 521~529
DOI : 10.5351/CKSS.2005.12.2.521
In this paper, we consider a bootstrap method for the 'row and column effects model' (RC model) for analyzing a contingency table with ordered variables. We propose a bootstrap procedure for testing independence, equality of intervals, and goodness of fit in the RC model. A real data example is included.
Revising K-Means Clustering under Semi-Supervision
Huh Myung-Hoe ; Yi SeongKeun ; Lee Yonggoo ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 531~538
DOI : 10.5351/CKSS.2005.12.2.531
In k-means clustering, we standardize variables before clustering and iterate two steps: allocation of units in the Euclidean sense and updating of centroids. In applications to DB marketing, where clusters are to be used as customer segments with similar consumption behaviors, we frequently acquire additional variables on the customers or the units a posteriori through marketing campaigns. Hence we need to modify the originally formed clusters after each campaign. The aim of this study is to propose a method for revising k-means clusters that incorporates the added information by weighting the clustering variables. We illustrate the proposed method with an empirical case.
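The two-step iteration and the effect of weighting the clustering variables can be sketched as follows. The abstract does not specify how the weights are derived from the added information, so the weights here are simply given as input; the function names are hypothetical, and the variables are assumed to be standardized already.

```python
def weighted_sq_dist(p, c, weights):
    """Weighted squared Euclidean distance between a unit and a centroid."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, c))


def allocate(points, centers, weights):
    """Step 1: allocate each unit to its nearest centroid."""
    return [
        min(range(len(centers)), key=lambda j: weighted_sq_dist(p, centers[j], weights))
        for p in points
    ]


def weighted_kmeans(points, centers, weights, n_iter=20):
    """Iterate allocation and centroid updating with variable weights."""
    centers = [tuple(c) for c in centers]
    for _ in range(n_iter):
        labels = allocate(points, centers, weights)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Step 2: update the centroid to the mean of its members.
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's centroid
        if new_centers == centers:
            break
        centers = new_centers
    return allocate(points, centers, weights), centers


pts = [(0, 0), (0, 10), (1, 0), (1, 10)]
start = [(0, 0), (1, 10)]
print(weighted_kmeans(pts, start, weights=(1.0, 0.0))[0])
print(weighted_kmeans(pts, start, weights=(0.0, 1.0))[0])
```

With all weight on the first variable the units split by their first coordinate, and with all weight on the second they split by their second coordinate, which shows how changing the weights alone can revise an existing segmentation.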
One-step Least Squares Fitting of Variogram
Choi, Hye-Mi ;
Communications for Statistical Applications and Methods, volume 12, issue 2, 2005, Pages 539~544
DOI : 10.5351/CKSS.2005.12.2.539
In this paper, we propose a one-step least squares method based on squared differences to estimate the parameters of the variogram used in spatial data modelling, and we discuss its asymptotic efficiency. The proposed method does not require specifying lags of interest or partitioning lags, so it removes the subjectivity and ambiguity that arise from lag selection when estimating spatial dependence.