Korean Journal of Applied Statistics
The Korean Statistical Society
Intervention Analysis of Korea Tourism Data
Kim, Su-Yong ; Seong, Byeong-Chan ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 735~743
DOI : 10.5351/KJAS.2011.24.5.735
This study analyzes inbound and outbound Korea tourism data through an intervention model. For the analysis, we adopt three intervention factors: (1) the IMF bailout crisis in December 1997, (2) the Severe Acute Respiratory Syndrome (SARS) outbreak in March 2003, and (3) the Lehman Brothers bankruptcy in September 2008. The empirical results show that only the SARS factor lowered inbound tourism, starting in April 2003, with a drastic decline in May 2003 followed by a gradual decay. In contrast, all three factors significantly lowered outbound tourism. In particular, the effect of the IMF crisis is shown to be permanent from December 1997, while the effects of SARS and the Lehman Brothers bankruptcy are abrupt and temporary with a gradual decay.
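The two effect patterns described above (permanent versus abrupt and temporary with gradual decay) can be written in the standard Box-Tiao intervention form. This is only a sketch: the abstract does not state the exact specification used, and the symbols $\omega$, $\delta$, $N_t$, $S_t^{(T)}$, and $P_t^{(T)}$ are notation assumed here.

```latex
% N_t: noise (e.g., seasonal ARIMA) series; B: backshift operator; T: intervention date.
% Step and pulse indicators:
%   S_t^{(T)} = 1 if t >= T, 0 otherwise;   P_t^{(T)} = 1 if t = T, 0 otherwise.

% Permanent level shift of size \omega starting at T:
Y_t = \omega\, S_t^{(T)} + N_t

% Abrupt, temporary effect that decays geometrically (0 < \delta < 1):
Y_t = \frac{\omega}{1 - \delta B}\, P_t^{(T)} + N_t ,
\qquad \text{so the effect at time } T + k \text{ is } \omega\,\delta^{k}, \quad k = 0, 1, 2, \dots
```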
Construction of an Economic Sentiment Indicator for the Korean Economy
Moon, Hye-Jung ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 745~758
DOI : 10.5351/KJAS.2011.24.5.745
An Economic Sentiment Indicator (ESI) is a composite indicator of business survey indices (BSI) and consumer survey indices (CSI). The ESI is designed to reflect economic agents' (both producers' and consumers') overall perceptions of economic activity in a one-dimensional index. The European Commission has published an ESI since 1985. This paper demonstrates the construction of an ESI for the Korean economy. The BSI and CSI components that are highly correlated with GDP and lead it are selected to construct the ESI; they are aggregated using a weighted average and then scaled to have a long-term average of 100 and a standard deviation of 10. Thus values greater than 100 indicate above-average economic sentiment and vice versa. The newly constructed Korean ESI that extends to January 2003 shows a good tracking performance of GDP and adequately reflects the overall perception of economic activity.
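The aggregation and scaling step described above can be sketched as follows. This is a minimal illustration, not the paper's procedure: the function name `build_esi`, the example series, and the weights are hypothetical, and standardizing each component before averaging is an assumption that the abstract does not confirm.

```python
import statistics

def build_esi(components, weights):
    """Weighted average of survey components, rescaled to mean 100 and SD 10.

    components: list of equally long series (hypothetical BSI/CSI components).
    weights: one non-negative weight per component.
    """
    n = len(components[0])
    # Assumption: put each component on a common scale before averaging.
    standardized = []
    for series in components:
        mu = statistics.fmean(series)
        sd = statistics.pstdev(series)
        standardized.append([(x - mu) / sd for x in series])
    # Weighted average across components at each time point.
    total_weight = sum(weights)
    composite = [
        sum(w * s[t] for w, s in zip(weights, standardized)) / total_weight
        for t in range(n)
    ]
    # Rescale so the long-term average is 100 and the standard deviation is 10.
    mu_c = statistics.fmean(composite)
    sd_c = statistics.pstdev(composite)
    return [100 + 10 * (c - mu_c) / sd_c for c in composite]

# Illustrative (made-up) data: two components, five periods.
esi = build_esi([[90, 95, 100, 110, 105], [80, 85, 95, 100, 90]], [0.6, 0.4])
```

Values of `esi` above 100 then correspond to above-average sentiment over the sample period.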
Tree-Structured Nonlinear Regression
Chang, Young-Jae ; Kim, Hyeon-Soo ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 759~768
DOI : 10.5351/KJAS.2011.24.5.759
Tree algorithms have been widely developed for regression problems. One of the good features of a regression tree is its flexibility of fitting, because it can capture the nonlinearity of data well. In particular, data with sudden structural breaks, such as the price of oil and exchange rates, can be fitted well with a simple mixture of a few piecewise linear regression models. Because split points are determined by chi-squared statistics related to residuals from fitting piecewise linear models, and the split variable is chosen by an objective criterion, we obtain a reasonable fit that is in line with the visual interpretation of the data. Piecewise linear regression by a regression tree can be used as a good fitting method and can be applied to datasets with large fluctuations.
A Method for Construction of Life Table in Korea
Park, You-Sung ; Kim, Seong-Yong ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 769~789
DOI : 10.5351/KJAS.2011.24.5.769
The life table is a statistical model for life expectancy and reflects the mortality experience of a particular group of people. Three issues must be settled before constructing a life table: how to estimate the death probability from observed death rates, which graduation method to use to smooth irregularities in the death probabilities, and how to extend the death probabilities to oldest-old ages. To construct the life table that best fits Korean mortality experience, we examine five estimation methods for the death probability, including Chiang's and Greville's, three graduation techniques, including Beer's and Greville's formulae, and twelve mathematical functions for the extension of death probabilities to oldest-old ages. We also propose a method to resolve the cross-over problem that arises in constructing the life table.
Semiparametric Seasonal Cointegrating Rank Selection
Seong, Byeong-Chan ; Ahn, Sung-K. ; Ch, Sin-Sup ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 791~797
DOI : 10.5351/KJAS.2011.24.5.791
This paper considers the issue of seasonal cointegrating rank selection by information criteria as an extension of Cheng and Phillips (2009). The method does not require the specification of lag length in the vector autoregression, is convenient in empirical work, and is semiparametric because it allows for a general short memory error component in the model with only lags related to error correction terms. Some limit properties of the usual information criteria are given for the rank selection, and small Monte Carlo simulations are conducted to evaluate the performance of the criteria.
Impact of Structural Shock and Estimation of Dynamic Response between Variables
Cho, Eun-Jung ; Kim, Tae-Ho ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 799~807
DOI : 10.5351/KJAS.2011.24.5.799
This study investigates long and short run responses of variables to exogenous shocks by imposing prior restrictions on a contemporaneous structural shock coefficient matrix of the model to identify shocks by endogenous variables in the vector autoregression. The relative importance of each structural shock in variation of each variable is calculated through the identification of proper restrictions (not based on any specific theory but on researcher judgment corresponding to actual situations) and an estimation of the structural vector autoregression. The results of the analyses are found to maintain consistency.
VaR Estimation with Multiple Copula Functions
Hong, Chong-Sun ; Lee, Won-Yong ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 809~820
DOI : 10.5351/KJAS.2011.24.5.809
VaR (Value at Risk) is a measure used in market risk management and needs to be estimated for multiple distributions. In this paper, copula functions are used to generate distributions of multivariate random variables. The dependence structure of the random variables is classified as exchangeable copula, fully nested copula, or partially nested copula. For the earning rate data of four Korean industries, the parameters of the Archimedean copula functions, including the Clayton, Gumbel, and Frank copulas, are estimated under the three kinds of dependence structure. These copula functions are then fitted to the data so that the corresponding VaRs are obtained and explored.
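To make the exchangeable case concrete, the sketch below draws dependent uniform variates from a Clayton copula using the Marshall-Olkin (gamma frailty) construction. It is an illustration only: the function name `sample_clayton` and all parameter values are hypothetical, the nested structures and the paper's estimation procedure are not covered, and mapping the uniforms to fitted return distributions to obtain a VaR is left out.

```python
import random

def sample_clayton(n, dim, theta, seed=None):
    """Draw n points from an exchangeable Clayton copula with parameter theta > 0.

    Marshall-Olkin construction: with V ~ Gamma(1/theta, 1) and independent
    E_i ~ Exp(1), the variates U_i = (1 + E_i / V) ** (-1 / theta) are uniform
    on (0, 1) and jointly follow a Clayton copula.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        v = rng.gammavariate(1.0 / theta, 1.0)  # shared frailty for this draw
        row = [(1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta)
               for _ in range(dim)]
        samples.append(row)
    return samples

# Illustrative call: four dependent uniforms per draw (e.g., four industries).
draws = sample_clayton(1000, dim=4, theta=2.0, seed=42)
```

Larger values of `theta` give stronger lower-tail dependence, which is the feature that usually matters for joint losses.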
Asymmetric CCC Modelling in Multivariate-GARCH with Illustrations of Multivariate Financial Data
Park, R.H. ; Choi, M.S. ; Hwan, S.Y. ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 821~831
DOI : 10.5351/KJAS.2011.24.5.821
In the field of financial time series, the adaptation of asymmetric features to multivariate GARCH processes has remained relatively incomplete (McAleer et al., 2009). Retaining the constant conditional correlation (CCC) structure, this article introduces asymmetric GARCH modelling for analysing multivariate volatilities in time series from a practical point of view. Multivariate Korean financial time series are analyzed in detail to compare our theory with conventional methodologies, including GARCH and EGARCH.
A Bayesian Extreme Value Analysis of KOSPI Data
Yun, Seok-Hoon ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 833~845
DOI : 10.5351/KJAS.2011.24.5.833
This paper conducts a statistical analysis of extreme values for both daily log-returns and daily negative log-returns, which are computed using a collection of KOSPI data from January 3, 1998 to August 31, 2011. The Poisson-GPD model is used as the statistical model for extreme values, and the maximum likelihood method is applied for the estimation of parameters and extreme quantiles. A Bayesian method is also applied to the Poisson-GPD model, assuming the usual noninformative prior distribution for the parameters and using the Markov chain Monte Carlo method for the estimation of parameters and extreme quantiles. According to this analysis, both the maximum likelihood method and the Bayesian method reach the same conclusion: the distribution of the log-returns has a shorter right tail than the normal distribution, whereas the distribution of the negative log-returns has a heavier right tail than the normal distribution. An advantage of using the Bayesian method in extreme value analysis is that there is no need to worry about the classical asymptotic properties of the maximum likelihood estimators even when the regularity conditions are not satisfied, and that in prediction it effectively reflects the uncertainties from both the parameters and a future observation.
Two-Stage Logistic Regression for Cancer Classification and Prediction from Copy-Number Changes in cDNA Microarray-Based Comparative Genomic Hybridization
Kim, Mi-Jung ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 847~859
DOI : 10.5351/KJAS.2011.24.5.847
cDNA microarray-based comparative genomic hybridization (CGH) data include low-intensity spots, and thus a statistical strategy is needed to detect subtle differences between different cancer classes. In this study, genes displaying a high frequency of alteration in one of the different classes were selected among the pre-selected genes that show relatively large variations between genes compared to total variations. Utilizing copy-number changes of the selected genes, this study suggests a statistical approach to predict patients' classes with increased performance by pre-classifying patients with similar genetic alteration scores. A two-stage logistic regression model (TLRM) is suggested to pre-classify homogeneous patients and predict patients' classes for cancer prediction; a decision tree (DT) is combined with logistic regression on the set of informative genes. The TLRM was constructed on cDNA microarray-based CGH data from the Cancer Metastasis Research Center (CMRC) at Yonsei University; it predicted the patients' clinical diagnoses with perfect matches (except for one patient) among the patients classified as high-risk or low-risk, where the performance of predictions is critical due to the high sensitivity and specificity requirements for clinical treatments. Accuracy validated by leave-one-out cross-validation (LOOCV) was 83.3%, while the other classification methods used for comparison, CART and DT, performed worse than the TLRM.
Exploration of the Gene-Gene Interactions Using the Relative Risks in Distinct Genotypes
Jung, Ji-Won ; Yee, Jae-Yong ; Lee, Suk-Hoon ; Pa, Mi-Ra ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 861~869
DOI : 10.5351/KJAS.2011.24.5.861
One of the main objectives of recent genetic studies is to understand the genetic factors that induce complex diseases. If there are interactions between loci, it is difficult to find such associations through a single-locus analysis strategy. Thus we need to consider gene-gene interactions and/or gene-environment interactions. The multifactor dimensionality reduction (MDR) method is frequently used; however, it is not appropriate for detecting interactions caused by a small fraction of the possible genotype pairs. In this study, we propose a relative risk interaction explorer that detects interactions by calculating the relative risks between the control and disease groups for each genetic combination. For illustration, we apply this method to MDR open source data. We also compare the MDR and the proposed method using simulated data from eight genetic models.
Logistic Regression Method in Interval-Censored Data
Yun, Eun-Young ; Kim, Jin-Mi ; Ki, Choong-Rak ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 871~881
DOI : 10.5351/KJAS.2011.24.5.871
In this paper we propose a logistic regression method to estimate the survival function and the median survival time in interval-censored data. The proposed method is motivated by the data augmentation technique with no sacrifice in augmenting data. In addition, we develop a cross validation criterion to determine the size of data augmentation. We compare the proposed estimator with other existing methods such as the parametric method, the single point imputation method, and the nonparametric maximum likelihood estimator through extensive numerical studies to show that the proposed estimator performs better than others in the sense of the mean squared error. An illustrative example based on a real data set is given.
Testing Log Normality for Randomly Censored Data
Kim, Nam-Hyun ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 883~891
DOI : 10.5351/KJAS.2011.24.5.883
For survival data we sometimes want to test a log normality hypothesis, which can be changed into a normality hypothesis by transforming the survival data. Hence the Shapiro-Wilk type statistic for normality is generalized to randomly censored data based on the Kaplan-Meier product limit estimate of the distribution function. Koziol and Green (1976) derived a randomly censored version of the Cramér-von Mises statistic under the simple hypothesis. These two test statistics are compared through a simulation study. For the distribution of the censoring variables, we consider the model of Koziol and Green (1976) and other similar models. The simulation results show that the power of the proposed statistic is higher than that of the Koziol-Green statistic and that the proportion of censored observations (rather than the distribution of the censoring variables) has a strong influence on the power of the proposed statistic.
Visualization for Experimental Designs
Jang, Dae-Heung ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 893~904
DOI : 10.5351/KJAS.2011.24.5.893
A course on experimental designs consists of two main parts: experimental designs and model analysis. Most progress in visualization has been made on the model analysis side. For the visualization of experimental designs, we can consider the visualization of Latin squares, supersaturated designs, and balanced incomplete block designs. We propose design plots, in addition to using scatterplots and scatterplot matrices, for the visualization of experimental designs. Visualizing experimental designs can produce a synergy effect in teaching a course on experimental designs.
Estimation of the Noise Variance in Image and Noise Reduction
Kim, Yeong-Hwa ; Nam, Ji-Ho ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 905~914
DOI : 10.5351/KJAS.2011.24.5.905
In the field of image processing, the removal of noise contamination from the original image is essential. However, for various reasons, the occurrence of noise is practically impossible to prevent completely. Thus, the reduction of the noise contained in images remains important. In this study, we estimate the level of noise variance based on a measurement of the relative strength of the noise, and we propose a noise reduction algorithm that uses a sigma filter. The proposed statistical noise reduction methodology provides significantly improved results over the usual sigma filtering regardless of the level of the noise variance.
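For orientation, the usual sigma filter that the abstract uses as its baseline can be sketched as below: each pixel is replaced by the average of those neighbours whose intensity lies within a multiple of the noise standard deviation of the centre pixel. This is the conventional filter, not the paper's proposed algorithm; the function name `sigma_filter`, the window radius, and the multiplier `k` are assumptions, and the noise standard deviation `sigma` is simply passed in rather than estimated.

```python
def sigma_filter(image, sigma, radius=1, k=2.0):
    """Conventional sigma filter on a 2-D list of intensities.

    A neighbour contributes to the local average only if it lies within
    k * sigma of the centre pixel, so sharp edges are left largely intact.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            center = image[i][j]
            total, count = 0.0, 0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        value = image[r][c]
                        if abs(value - center) <= k * sigma:
                            total += value
                            count += 1
            # count >= 1 because the centre pixel always qualifies.
            out[i][j] = total / count
    return out

# Made-up 3x3 image: a noisy dark region next to a bright edge column.
smoothed = sigma_filter([[10, 12, 100], [11, 9, 100], [10, 11, 100]], sigma=2)
```

In this example the dark pixels are averaged among themselves while the bright column is left at 100, which is the edge-preserving behaviour the filter is known for.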
A Statistical Study on Korean Baseball League Games
Choi, Young-Gun ; Kim, Hyoung-Moon ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 915~930
DOI : 10.5351/KJAS.2011.24.5.915
There are a variety of methods to model game results and many methods exist for the case of paired comparison data. Among them, the Bradley-Terry model is the most widely used to derive a latent preference scale from paired comparison data. It has been applied in a variety of fields in psychology and related disciplines. We applied this model to the data of Korean Baseball League. It shows that the loglinear Bradley-Terry model of defensive rate and save is optimal in terms of AIC. Also some categorical characteristics, such as east team and west team, existence of golden glove winning players, team(s) with seasonal pitching leader, and team(s) with home advantage, influenced the game result significantly. As a result, the suggested models can be further utilized to predict future game results.
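As a point of reference, the basic Bradley-Terry model assigns each team $i$ a positive strength $p_i$ and sets the probability that $i$ beats $j$ to $p_i / (p_i + p_j)$. The sketch below fits those strengths with the standard iterative (minorization-maximization) update. It covers only the plain model without covariates, not the loglinear version with team characteristics used in the paper; the function name `fit_bradley_terry` and the win counts are hypothetical.

```python
def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry strengths from a matrix of win counts.

    wins[i][j] is the number of games in which team i beat team j.
    Returns strengths normalized to sum to the number of teams.
    Assumes every team has played at least one game.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(iterations):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Games between i and j, weighted by the current strengths.
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new_p.append(total_wins / denom)
        scale = n / sum(new_p)
        p = [x * scale for x in new_p]
    return p

# Made-up record: team 0 beat team 1 three times and lost once.
strengths = fit_bradley_terry([[0, 3], [1, 0]])
win_prob = strengths[0] / (strengths[0] + strengths[1])
```

With only two teams the fitted win probability equals the observed win fraction; with more teams the strengths pool information across all pairings.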
An Alternative Study of the Determination of the Threshold for the Generalized Pareto Distribution
Yoon, Jeong-Yoen ; Cho, Jae-Beom ; Jun, Byoung-Cheol ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 931~939
DOI : 10.5351/KJAS.2011.24.5.931
In practice, thresholds for a generalized Pareto distribution are determined by one of two subjective assessment methods: the mean extreme function graph (MEF-graph) or the Hill-graph. To remedy the subjectiveness of these methods, we propose an alternative method to determine the threshold based on robust statistics. We compared the MEF-graph, the Hill-graph, and our method through VaRs on Korean stock market data from January 5, 1987 to August 3, 2009. The VaR based on the proposed method is not much different from that of the existing methods, and the standard deviation of the VaR for our method was the smallest. The results show that our method can be a promising alternative for determining thresholds of generalized Pareto distributions.
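The graphical method mentioned above relies on what is often called the mean excess function: the average amount by which observations exceed a threshold $u$. For a generalized Pareto tail this function is linear in $u$, so analysts look for the point where the plot becomes roughly straight, which is where the subjectivity enters. A minimal sketch follows; it assumes the abstract's "mean extreme function" refers to this quantity, and the function name `mean_excess` and the data are hypothetical.

```python
def mean_excess(data, threshold):
    """Average excess over the threshold among observations that exceed it."""
    excesses = [x - threshold for x in data if x > threshold]
    if not excesses:
        raise ValueError("no observations exceed the threshold")
    return sum(excesses) / len(excesses)

# Made-up losses; evaluate the function over a grid of candidate thresholds.
losses = [1, 2, 3, 4, 5]
curve = [(u, mean_excess(losses, u)) for u in (0, 1, 2, 3)]
```

Plotting the pairs in `curve` against the threshold gives the graph that is then inspected by eye.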
Derivation and Application of Influence Function in Discriminant Analysis for Three Groups
Lee, Hae-Jung ; Kim, Hong-Gie ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 941~949
DOI : 10.5351/KJAS.2011.24.5.941
The influence function is used to develop criteria to detect outliers in discriminant analysis. We derive the influence function of observations on the estimate of the misclassification probability in discriminant analysis for three groups. The proposed measures are applied to facial image data to identify outliers, and the discriminant analysis is redone with the outliers excluded. The study shows that the derived influence function is more efficient than the discriminant probability approach.
Bayesian Inference for the Zero Inflated Negative Binomial Regression Model
Shim, Jung-Suk ; Lee, Dong-Hee ; Jun, Byoung-Cheol ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 951~961
DOI : 10.5351/KJAS.2011.24.5.951
In this paper, we propose Bayesian inference using the Markov chain Monte Carlo (MCMC) method for the zero inflated negative binomial (ZINB) regression model. The proposed model allows a regression model for the zero inflation probability as well as a regression model for the mean of the dependent variable. This extends the work of Jang et al. (2010) to the fully defined ZINB regression model. In addition, we apply the proposed method to a real data example and compare its efficiency with the zero inflated Poisson (ZIP) model using the DIC. Since the DIC of the ZINB is smaller than that of the ZIP, the ZINB model shows superior performance over the ZIP model for zero inflated count data with overdispersion.
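The ZINB distribution mixes a point mass at zero (with probability $\pi$) with a negative binomial count. The sketch below evaluates its probability mass function in the common mean-dispersion parameterization. The parameter names `mu`, `r`, and `pi` are assumptions rather than the paper's notation, and the regression links (for example, a logit link for `pi` and a log link for `mu`) and the MCMC sampler are not shown.

```python
import math

def nb_pmf(y, mu, r):
    """Negative binomial pmf with mean mu > 0 and dispersion (size) r > 0."""
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu))
             + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, r, pi):
    """Zero inflated negative binomial pmf with zero inflation probability pi."""
    base = (1.0 - pi) * nb_pmf(y, mu, r)
    # Zeros come either from the point mass or from the count component.
    return pi + base if y == 0 else base

# Illustrative values: the inflated zero probability exceeds the plain NB one.
p_zero_zinb = zinb_pmf(0, mu=2.0, r=1.5, pi=0.3)
p_zero_nb = nb_pmf(0, mu=2.0, r=1.5)
```

The mean of this distribution is $(1 - \pi)\mu$, so zero inflation lowers the mean while the negative binomial component accounts for overdispersion.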
Modified Recursive PCA
Kim, Dong-Gyu ; Kim, Ah-Hyoun ; Kim, Hyun-Joong ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 963~977
DOI : 10.5351/KJAS.2011.24.5.963
Principal component analysis (PCA) is a well-studied statistical technique and an important tool for handling multivariate data. Although many algorithms exist for PCA, most of them are unsuitable for real-time applications or high-dimensional problems. Since it is desirable to avoid extensive matrix operations in such cases, alternative solutions are required to calculate the eigenvalues and eigenvectors of the sample covariance matrix. Erdogmus et al. (2004) proposed Recursive PCA (RPCA), a fast adaptive on-line solution for PCA based on first-order perturbation theory. It facilitates the real-time implementation of PCA by recursively approximating updated eigenvalues and eigenvectors. However, the performance of the RPCA method becomes questionable as the size of newly added data increases. In this paper, we modify the RPCA method by taking advantage of the mathematical relation between the eigenvalues and eigenvectors of the sample covariance matrix. We compared the performance of the proposed algorithm with that of RPCA and found that the accuracy of the proposed method is remarkably improved.
Outlier Detection Using Dynamic Plots
Ahn, Byung-Jin ; Seo, Han-Son ;
Korean Journal of Applied Statistics, volume 24, issue 5, 2011, Pages 979~986
DOI : 10.5351/KJAS.2011.24.5.979
Linear regression is commonly used to analyze data because of its simplicity and applicability; however, it is well known that data may contain outliers and influential cases that can have a harmful effect on a statistical analysis. Thus the detection and examination of outliers or influential cases are important parts of data analysis. In detecting multiple outliers, masking effects usually occur and make it difficult to identify the true outliers. We propose using dynamic plots as a method resistant to the masking effect. The procedure using dynamic plots is useful for finding appropriate basic sets from which a dependent outlier detection method can start and detect the true set of outliers. Examples are given to demonstrate the effectiveness of the suggested idea.