Korean Journal of Applied Statistics
The Korean Statistical Society
Volume 27, Issue 2 - Apr 2014
Standardizing Unstructured Big Data and Visual Interpretation using MapReduce and Correspondence Analysis
Choi, Joseph ; Choi, Yong-Seok ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 169~183
DOI : 10.5351/KJAS.2014.27.2.169
Massive and varied data recorded everywhere are called big data. It is therefore important to analyze big data and to find valuable information in it. Standardizing unstructured big data is also important for applying statistical methods. In this paper, we show how to standardize unstructured big data using MapReduce, a distributed processing system. We also apply simple correspondence analysis and multiple correspondence analysis to find the relationships and characteristics of words directly related to Samsung Electronics and The Korea Economic Daily newspaper, as well as Apple Inc.
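As a rough illustration of how MapReduce can turn raw text into a structured table, the sketch below runs the map, shuffle and reduce steps in a single Python process. The documents, the tokenizing rule and all function names are assumptions made for illustration; the paper's actual Hadoop jobs are not described in the abstract.

```python
from collections import defaultdict
import re


def map_phase(text):
    """Map: emit a (word, 1) pair for every lowercase token in one raw document."""
    for word in re.findall(r"[a-z]+", text.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: collapse the grouped values into one total per word."""
    return word, sum(counts)


def word_frequencies(documents):
    """Run map, shuffle and reduce over a dict of {doc_id: raw text}."""
    pairs = [kv for text in documents.values() for kv in map_phase(text)]
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


# Hypothetical two-document corpus.
docs = {
    "d1": "Samsung launches a new phone",
    "d2": "Apple and Samsung phone sales",
}
freq = word_frequencies(docs)
```

The resulting word-by-count table is the kind of structured input that a correspondence analysis could start from once it is cross-tabulated by source.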
New Method for Preference Measurement in Ranking-based Conjoint Analysis
Kim, Bu-Yong ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 185~195
DOI : 10.5351/KJAS.2014.27.2.185
Ranking-based conjoint analysis is widely used in various fields such as marketing research. While the ranking-based conjoint affords several advantages over the rating-based or choice-based conjoint, it has a serious shortcoming that respondents have much difficulty in ranking the product profiles in order of preference when many profiles are involved. This article suggests a new method for the preference measurement to improve the response efficiency. The method employs the concept of ranking sets that let the respondent evaluate a small number of profiles at a time. Through the proposed method, preference rankings of profiles obtained from each ranking set are aggregated to generate overall rankings. The balanced incomplete block design is expanded and transformed to the dual design in order to construct well-balanced ranking sets that can accommodate a large number of profiles. The proposed method is applied to the analysis of consumer preferences for perfume-for-women.
The Effect of the Selection Attribute of Golf Course on Customer Satisfaction and Customer Loyalty
Han, Yoon Sang ; Kim, Yon Hyong ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 197~209
DOI : 10.5351/KJAS.2014.27.2.197
This paper analyzes the relationships among the selection attributes of a golf course, service quality, service value, and customer loyalty, where customer loyalty covers both revisiting the golf course and the word-of-mouth effect. The selection attributes of a golf course, such as convenience, cost, course condition and service, have a significant effect on service quality, service value, customer satisfaction and customer loyalty. Service quality has a significant effect on service value, customer satisfaction and customer loyalty. Customer satisfaction is estimated to have a significant effect on customer loyalty.
Benefit-Cost Analysis of National Pensioners by Income and Life Expectancy
Han, Jeonglim ; Lee, Hangsuck ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 211~226
DOI : 10.5351/KJAS.2014.27.2.211
This paper discusses life expectancy differentials among beneficiaries of the national pension old-age benefit and a benefit-cost analysis in Korea. The results are useful indicators for assessing the retirement income security of beneficiaries and old-age benefits. The paper analyzes the benefit-cost ratio, the internal rate of return and the generation transfer amount, using life tables by lifetime income. The actuarial analysis gives a male life expectancy of approximately 21.69 to 24.63 years and a female life expectancy of approximately 27.63 to 29.81 years. For the low income level, the benefit-cost ratio is approximately 2.68 to 4.83% lower, the internal rate of return approximately 0.00 to 0.74% lower, and the generation transfer amount approximately 3.00 to 5.74% lower than for the total income level. For the high income level, the benefit-cost ratio is approximately 2.07 to 4.98% higher, the internal rate of return approximately 0.03 to 1.73% higher, and the generation transfer amount approximately 2.53 to 9.68% higher than for the total income level. The results vary by income because of the effects of income redistribution and life expectancy on the national pension.
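The two headline measures can be sketched with ordinary discounting. The example below is a minimal sketch that assumes yearly cash flows and a fixed discount rate; the cash-flow values and function names are hypothetical, and the paper's life-table weighting is not reproduced.

```python
def present_value(cash_flows, rate):
    """Discount a list of yearly amounts (list index = year) at a fixed rate."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def benefit_cost_ratio(contributions, benefits, rate):
    """Present value of benefits divided by present value of contributions."""
    return present_value(benefits, rate) / present_value(contributions, rate)


def internal_rate_of_return(contributions, benefits, lo=-0.99, hi=1.0, tol=1e-10):
    """Bisection for the rate at which PV(benefits) equals PV(contributions)."""
    def npv(rate):
        return present_value(benefits, rate) - present_value(contributions, rate)

    f_lo = npv(lo)
    if f_lo * npv(hi) > 0:
        raise ValueError("no sign change on [lo, hi]; widen the bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


# Hypothetical toy case: pay 100 now, receive 110 one year later.
contributions = [100.0, 0.0]
benefits = [0.0, 110.0]
irr = internal_rate_of_return(contributions, benefits)
bcr = benefit_cost_ratio(contributions, benefits, rate=0.0)
```

A benefit-cost ratio above 1 at the chosen discount rate means discounted benefits exceed discounted contributions, and the internal rate of return is the rate at which that ratio equals exactly 1.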
DD-Plot for ANCOVA Models
Jang, Dae-Heung ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 227~237
DOI : 10.5351/KJAS.2014.27.2.227
When some of the predictor variables in a regression analysis are qualitative, we use a regression model with indicator variables. We use the ANCOVA (analysis of covariance) model to compare the response variable among groups while statistically controlling for variation in the response variable caused by variation in the covariate. The DD-plot can be used as a graphical exploratory data analysis tool before confirmatory data analysis. With the DD-plot, we can discern differences among groups in the regression model with indicator variables or in the ANCOVA model at a glance. Constructing a DD-plot does not require model assumptions about the error terms of the regression model. Several examples show the usefulness of DD-plots as a graphical exploratory data analysis tool for regression analysis.
An Additive Stratified Quantitative Attribute Randomized Response Model
Lee, Gi-Sung ; Ahn, Seung-Chul ; Hong, Ki-Hak ; Son, Chang-Kyoon ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 239~247
DOI : 10.5351/KJAS.2014.27.2.239
For a sensitive survey in which the population is composed of several strata with quantitative attributes, we present an additive stratified quantitative attribute randomized response model that applies stratified random sampling instead of simple random sampling to Himmelfarb-Edgell's additive quantitative attribute model and to Gjestvang-Singh's model. We also establish the theoretical grounds for estimating the stratum means of sensitive quantitative attributes as well as the overall mean. We deal with the proportional and optimal allocation problems in each suggested model and compare the relative efficiency of the two suggested models; the comparison shows that Himmelfarb-Edgell's model is more efficient than Gjestvang-Singh's model under stratified random sampling.
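The basic estimator behind an additive scrambling model can be sketched as follows. This assumes the common additive form in which each respondent reports Z = X + S, where X is the sensitive value and S is a scrambling variable whose mean is known to the analyst; the stratum names, weights and reported values are hypothetical, and the Gjestvang-Singh variant and the allocation results are not shown.

```python
def stratum_mean(reports, scramble_mean):
    """Estimate a stratum mean of X from scrambled reports Z = X + S."""
    return sum(reports) / len(reports) - scramble_mean


def overall_mean(reports_by_stratum, weights, scramble_mean):
    """Combine stratum estimates using known stratum weights W_h (summing to 1)."""
    return sum(
        weights[h] * stratum_mean(reports, scramble_mean)
        for h, reports in reports_by_stratum.items()
    )


# Hypothetical scrambled reports from two strata; scrambling mean assumed to be 5.
reports = {"urban": [12.0, 14.0, 16.0], "rural": [9.0, 11.0]}
weights = {"urban": 0.6, "rural": 0.4}
estimate = overall_mean(reports, weights, scramble_mean=5.0)
```

Because only the mean of S is subtracted, no individual's true value can be recovered from a single report, yet the stratum and overall means remain estimable.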
Measuring Hadoop Optimality by Lorenz Curve
Kim, Woo-Cheol ; Baek, Changryong ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 249~261
DOI : 10.5351/KJAS.2014.27.2.249
Ever-increasing "big data" can only be processed effectively by parallel computing. Parallel computing refers to a high-performance computational method that achieves effectiveness by dividing a big query into smaller subtasks and aggregating the results of the subtasks to produce an output. However, it is well known that parallel computing does not necessarily achieve scalability, meaning that performance improves linearly as more computers are added, because scalability requires a very careful assignment of tasks to each node and timely collection of results. Hadoop is one of the most successful platforms for attaining scalability. In this paper, we propose a measure of Hadoop optimization that uses a Lorenz curve as a proxy for the inequality of hardware resource usage. Our proposed index takes into account the intrinsic overhead of Hadoop systems, such as CPU, disk I/O and network. It therefore also indicates explicitly whether a given Hadoop system can be improved and in what capacity. Our proposed method is illustrated with experimental data and substantiated by Monte Carlo simulations.
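To make the Lorenz curve idea concrete, the sketch below builds a Lorenz curve from per-node resource usage and summarizes it with the Gini coefficient. The Gini coefficient is a standard summary swapped in here for illustration; it is not the index proposed in the paper, and the usage figures are hypothetical.

```python
def lorenz_curve(usage):
    """Return Lorenz points (cumulative share of nodes, cumulative share of usage)."""
    values = sorted(usage)
    total = sum(values)
    n = len(values)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, value in enumerate(values, start=1):
        running += value
        points.append((i / n, running / total))
    return points


def gini(usage):
    """Gini coefficient = 1 - 2 * (area under the Lorenz curve, trapezoid rule)."""
    pts = lorenz_curve(usage)
    area = sum((x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1.0 - 2.0 * area


# Hypothetical CPU-seconds consumed by four worker nodes.
balanced = gini([10.0, 10.0, 10.0, 10.0])   # work spread evenly across nodes
skewed = gini([0.0, 0.0, 0.0, 40.0])        # one node does all the work
```

A value near 0 indicates evenly loaded nodes, while a value near 1 indicates that a few nodes carry most of the load.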
Bayesian Hierarchical Mixed Effects Analysis of Time Non-Homogeneous Markov Chains
Sung, Minje ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 263~275
DOI : 10.5351/KJAS.2014.27.2.263
The present study used a hierarchical Bayesian approach to develop a mixed effects model that describes the transitional behavior of subjects in time non-homogeneous Markov chains. The posterior distributions of the model parameters were not analytically tractable; therefore, a Gibbs sampling method was used to draw samples from the full conditional posterior distributions. The proposed model was applied to real data.
Comparison of Bias Correction Methods for the Rare Event Logistic Regression
Kim, Hyungwoo ; Ko, Taeseok ; Park, No-Wook ; Lee, Woojoo ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 277~290
DOI : 10.5351/KJAS.2014.27.2.277
We analyzed binary landslide data from the Boeun area with logistic regression. Since there are only 9 landslide occurrences among 5000 observations, the data can be regarded as rare event data. The main issue in logistic regression with rare event data is serious bias in the regression coefficient estimates. Two bias correction methods have been proposed previously, and we compared them quantitatively via simulation. Firth's (1993) approach outperformed the other and provided the most stable results for analyzing rare-event binary data.
A Numerical Study on CUSUM Test for Volatility Shifts Against Long-Range Dependence
Lee, Youngsun ; Lee, Taewook ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 291~305
DOI : 10.5351/KJAS.2014.27.2.291
Persistence is one of the typical characteristics of the volatility of financial time series. According to recent research, volatility persistence may be due to either volatility shifts or long-range dependence. In this paper, we consider residual-based CUSUM tests to distinguish between volatility shifts and long-range dependence as sources of volatility persistence in GARCH models. We observe that this test procedure achieves reasonable power without size distortion. Moreover, we employ the AIC and BIC criteria to estimate the change points and the number of change points in volatility. We demonstrate the superiority of residual-based CUSUM tests through various Monte Carlo simulations and an empirical data analysis.
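A CUSUM statistic on squared residuals can be sketched as below. This is a generic CUSUM-of-squares form, assumed for illustration: it takes the residuals as given (for example, standardized residuals from a fitted GARCH model) and does not reproduce the paper's exact test or its AIC/BIC change-point selection. The toy residual series is constructed by hand.

```python
import math


def cusum_of_squares(residuals):
    """Return (test statistic, estimated change point) for squared residuals.

    Statistic: max over k of |S_k - (k/n) S_n| / (sqrt(n) * tau), where S_k is
    the partial sum of squared residuals and tau is their standard deviation.
    """
    n = len(residuals)
    squares = [e * e for e in residuals]
    total = sum(squares)
    mean_sq = total / n
    tau = math.sqrt(sum((s - mean_sq) ** 2 for s in squares) / n)
    if tau == 0.0:
        raise ValueError("squared residuals are constant; statistic undefined")

    best_stat, best_k = 0.0, 0
    running = 0.0
    for k, s in enumerate(squares, start=1):
        running += s
        stat = abs(running - (k / n) * total) / (math.sqrt(n) * tau)
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k


# Hypothetical residuals whose scale triples after observation 50.
low = [1.0 if i % 2 == 0 else -1.0 for i in range(50)]
high = [3.0 if i % 2 == 0 else -3.0 for i in range(50)]
stat, change_point = cusum_of_squares(low + high)
```

The index at which the statistic peaks serves as a natural estimate of where the volatility level changed.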
A Test on a Specific Set of Outlier Candidates in a Linear Model
Seo, Han Son ; Yoon, Min ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 307~315
DOI : 10.5351/KJAS.2014.27.2.307
An exact distribution of the test statistic for multiple outlier candidates does not generally exist; therefore, tests of individual outliers (or tests using simulated critical values) are usually conducted instead of tests for groups of outliers. This article concerns procedures for testing outlying observations. We suggest a method that can be applied to arbitrary observations or to multiple outlier candidates detected by an outlier detection method. A Monte Carlo study is used to compare the performance of the proposed method with that of others.
A Prediction Model for Depression Risk
Kim, Jaeyong ; Min, Byungju ; Lee, Jaehoon ; Chang, Jae Seung ; Ha, Tae Hyon ; Ha, Kyooseob ; Park, Taesung ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 317~330
DOI : 10.5351/KJAS.2014.27.2.317
Bipolar disorder is a mental disorder characterized by manic and major depressive episodes. It is important to determine the degree of depression when treating patients with bipolar disorder because 8 to 10% of bipolar patients commit suicide during the periods in which they experience major depressive episodes. The Hamilton depression rating scale is most commonly used to estimate the degree of depression in a patient. This paper proposes using the Hamilton depression rating scale to estimate the effectiveness of patient treatment based on the linear mixed effects model and the transition model. Study subjects were patients recruited from Seoul National University Bundang Hospital who scored 8 points or above on the Hamilton depression rating scale at their first medical examination. The linear mixed effects model and the transition model were fitted using the Hamilton depression rating scale scores measured at baseline and at the six-month and twelve-month follow-ups. The Hamilton depression rating scale score at the twenty-four-month follow-up was then predicted using these models. The prediction models were evaluated by comparing the observed and predicted Hamilton depression rating scale scores at the twenty-four-month follow-up.
Modeling Clustered Interval-Censored Failure Time Data with Informative Cluster Size
Kim, Jinheum ; Kim, Youn Nam ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 331~343
DOI : 10.5351/KJAS.2014.27.2.331
We propose two estimating procedures, based on a marginal model, for analyzing clustered interval-censored data with an informative cluster size, and we investigate their asymptotic properties. One is an extension of Cong et al. (2007) to interval-censored data, and the other uses the within-cluster resampling method proposed by Hoffman et al. (2001). Simulation results imply that the proposed estimators perform better, in terms of bias and coverage rate of the true value, than an estimator with no adjustment for informative cluster size when the cluster size is related to survival time. Finally, the procedures are applied to lymphatic filariasis data taken from Williamson et al. (2008).
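The within-cluster resampling idea can be sketched in a few lines: draw one observation per cluster, compute the estimate on that draw, repeat, and average. The sketch below uses a plain mean as a stand-in estimator, which is an assumption made for illustration; the paper applies the idea to a marginal model for interval-censored survival data. The clusters are hypothetical.

```python
import random


def within_cluster_resampling_mean(clusters, n_resamples=1000, seed=0):
    """Average, over resamples, of the mean of one random draw per cluster."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        draw = [rng.choice(cluster) for cluster in clusters]
        estimates.append(sum(draw) / len(draw))
    return sum(estimates) / len(estimates)


def pooled_mean(clusters):
    """Naive mean over all observations, which lets large clusters dominate."""
    values = [v for cluster in clusters for v in cluster]
    return sum(values) / len(values)


# Hypothetical data where cluster size is informative: the large cluster has low values.
clusters = [[1.0, 1.0, 1.0, 1.0], [3.0]]
wcr = within_cluster_resampling_mean(clusters)
naive = pooled_mean(clusters)
```

Each cluster contributes equally to the resampled estimate regardless of its size, which is why the two numbers differ when size and outcome are related.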
A GLR Chart for Monitoring a Zero-Inflated Poisson Process
Choi, Mi Lim ; Lee, Jaeheon ;
Korean Journal of Applied Statistics, volume 27, issue 2, 2014, Pages 345~355
DOI : 10.5351/KJAS.2014.27.2.345
The number of nonconformities in a unit is commonly modeled by a Poisson distribution. As an extension of the Poisson distribution, a zero-inflated Poisson (ZIP) process can be used to fit count data with an excessive number of zeros. In this paper, we propose a generalized likelihood ratio (GLR) chart to monitor shifts in the two parameters of the ZIP process. We also compare the proposed GLR chart with the combined cumulative sum (CUSUM) chart and the single CUSUM chart. The overall performance of the GLR chart is shown to be comparable to that of the CUSUM charts and significantly better in some cases where the actual directions of the shifts differ from the directions pre-specified in the CUSUM charts.
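The two parameters being monitored can be seen in the ZIP probability mass function. The sketch below assumes the usual parameterization, with p as the probability of a structural zero and lam as the Poisson mean; the symbol names and the parameter values are chosen here for illustration, and the GLR chart itself is not implemented.

```python
import math


def zip_pmf(k, p, lam):
    """P(X = k) under a zero-inflated Poisson with zero-inflation p and mean lam."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally (prob. p) or from the Poisson part.
        return p + (1.0 - p) * poisson
    return (1.0 - p) * poisson


# Hypothetical in-control parameters.
p, lam = 0.3, 2.0
probabilities = [zip_pmf(k, p, lam) for k in range(60)]
mean = sum(k * pk for k, pk in enumerate(probabilities))
```

The mean of the process is (1 - p) * lam, so a shift in either parameter changes the expected count.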