Korean Journal of Applied Statistics
The Korean Statistical Society
A Study of Outlier Detection Using the Mixture of Extreme Distributions Based on Deep-Sea Fishery Data
Lee, Jung Jin ; Kim, Jae Kyoung ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 847~858
DOI : 10.5351/KJAS.2015.28.5.847
Deep-sea fishing in the Antarctic Ocean has been actively pursued by developed countries, including Korea. To prevent environmental destruction of the Antarctic Ocean, the countries concerned established the Commission for the Conservation of Antarctic Marine Living Resources (CCAMLR) and monitor illegal, unreported, or unregulated fishing. Fishing of toothfish, an expensive fish, in the Antarctic Ocean has increased recently, and high catches per unit effort (CPUE) of fishing boats, which raise suspicion of illegal activity, have been reported frequently. CPUE data in a fishing area of the Antarctic Ocean often follow an extreme distribution or a mixture of two extreme distributions. This paper proposes an algorithm to detect CPUE outliers using a mixture of two extreme distributions. The parameters of the mixture distribution are estimated by the EM algorithm. The log-likelihood value and posterior probabilities are used to detect outliers. Experiments show that the proposed algorithm can be adopted in place of simple criteria such as flagging any CPUE greater than 1.
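As a rough illustration of the posterior-probability step mentioned in the abstract above, the sketch below computes the probability that a CPUE value belongs to the upper component of a two-component mixture. The Gumbel form of the components, the parameter values, and the function names are assumptions made for illustration. The abstract does not state which extreme distribution the authors use, and the EM fitting step is not shown.

```python
import math

def gumbel_pdf(x, loc, scale):
    """Density of a Gumbel (maximum) distribution; an assumed component form."""
    z = (x - loc) / scale
    return math.exp(-(z + math.exp(-z))) / scale

def posterior_upper(x, weight_upper, lower, upper):
    """Posterior probability that x was drawn from the upper component.

    lower and upper are (loc, scale) pairs; weight_upper is the mixing
    proportion of the upper component. All values are hypothetical inputs
    that would normally come from an EM fit.
    """
    f_lower = (1.0 - weight_upper) * gumbel_pdf(x, *lower)
    f_upper = weight_upper * gumbel_pdf(x, *upper)
    return f_upper / (f_lower + f_upper)

# Hypothetical fitted parameters, not taken from the paper.
LOWER = (0.3, 0.1)
UPPER = (1.2, 0.3)
WEIGHT_UPPER = 0.2

for cpue in (0.3, 0.8, 1.5):
    print(cpue, round(posterior_upper(cpue, WEIGHT_UPPER, LOWER, UPPER), 4))
```

A value with a high posterior probability for the upper component would be a candidate outlier under this reading. The threshold for flagging is left to the analyst.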
The Effects of Breeding Environment Adjustment in FABP4 Gene Identification of Korean Cattle
Kim, Hyun-Ji ; Lee, Jea-Young ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 859~870
DOI : 10.5351/KJAS.2015.28.5.859
Economic traits of livestock are affected by environmental and genetic factors. We are interested in the genetic factors that influence the economic traits of Korean cattle. Environmental factors must be adjusted for in order to improve the accuracy of genetic effect analysis. In this paper, we propose a statistical model for Korean cattle that removes the environmental effects of breeding farm and age. We formulated an adjusted economic-trait value and applied the multifactor dimensionality reduction (MDR) method to the data before and after adjustment to identify major FABP4 genes. The adjustment increased the accuracy of the analysis and allowed us to identify superior FABP4 genes that influence grade and fatty acid.
Filtered Coupling Measures for Variable Selection in Sparse Vector Autoregressive Modeling
Lee, Seungkyu ; Baek, Changryong ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 871~883
DOI : 10.5351/KJAS.2015.28.5.871
Vector autoregressive (VAR) models in high dimensions suffer from noisy estimates, unstable predictions, and difficult interpretation. Consequently, the sparse vector autoregressive (sVAR) model, which forces many small coefficients in VAR to exactly zero, has been suggested and proven effective for modeling high-dimensional time series data. This paper studies coupling measures for selecting non-zero coefficients in sVAR. A simulation study reveals that removing the effect of other variables greatly improves the performance of coupling measures. Because sVAR model coefficients are asymmetric, asymmetric coupling measures such as Granger causality reduce computational costs. We propose two asymmetric coupling measures, filtered-cross-correlation and filtered-Granger-causality, based on the filtered residual series. The simulation study shows that the proposed coupling measures are adequate for heavy-tailed and high-order sVAR models.
Functional Forecasting of Seasonality
Lee, Geung-Hee ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 885~893
DOI : 10.5351/KJAS.2015.28.5.885
It is important to improve the forecasting accuracy of one-year-ahead seasonal factors in order to produce seasonally adjusted series of the following year. In this paper, seasonal factors of 8 monthly Korean economic time series are examined and forecast based on the functional principal component regression. One-year-ahead forecasts of seasonal factors from the functional principal component regression are compared with other forecasting methods based on mean absolute error (MAE) and mean absolute percentage error (MAPE). Forecasting seasonal factors via the functional principal component regression performs better than other comparable methods.
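The two accuracy measures named in the abstract above can be sketched as follows. This is a minimal illustration using the standard definitions of MAE and MAPE. The function names and sample numbers are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(actual, forecast):
    """MAE: average of |actual - forecast| over all periods."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mean_absolute_percentage_error(actual, forecast):
    """MAPE in percent: average of |actual - forecast| / |actual|, times 100."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Hypothetical seasonal factors (index form) and their forecasts.
actual_factors = [100.0, 200.0]
forecast_factors = [110.0, 190.0]
print(mean_absolute_error(actual_factors, forecast_factors))
print(mean_absolute_percentage_error(actual_factors, forecast_factors))
```

MAE is expressed in the units of the series, while MAPE is scale-free, which is one reason to report both when comparing methods across series of different magnitude.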
Likelihood Approximation of Diffusion Models through Approximating Brownian Bridge
Lee, Eun-kyung ; Sim, Songyong ; Lee, Yoon Dong ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 895~906
DOI : 10.5351/KJAS.2015.28.5.895
Diffusion is a mathematical tool used to explain the fluctuation of financial assets and the movement of particles on a micro time scale. There are ongoing statistical efforts to develop likelihood-based estimation methods for diffusion models. When we estimate diffusion models by applying maximum likelihood estimation to data observed at discrete time points, we need to know the transition density of the diffusion. To approximate the transition densities of diffusion models, we suggest a method that approximates the path integral of the random process with normal random variables, and we compare the numerical properties of the method with those of other approximation methods.
Comparisons of the Performance with Bayes Estimator and MLE for Control Charts Based on Geometric Distribution
Hong, Hwiju ; Lee, Jaeheon ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 907~920
DOI : 10.5351/KJAS.2015.28.5.907
Charts based on the geometric distribution are effective for monitoring the proportion of nonconforming items in high-quality processes where the in-control proportion nonconforming is low. The implementation of these charts is often based on the assumption that the in-control proportion nonconforming is known or accurately estimated. However, accurate parameter estimation is very difficult and may require a larger sample size than is available in practice for high-quality processes where the proportion of nonconforming items is very small. An inaccurate estimate of the parameter can result in estimated control limits that make the monitoring process unreliable. The maximum likelihood estimator (MLE) is often used to estimate the in-control proportion nonconforming. In this paper, we recommend a Bayes estimator for the in-control proportion nonconforming to incorporate practitioner knowledge and avoid estimation issues when no nonconforming items are observed in the Phase I sample. The effects of parameter estimation on the geometric chart and the geometric CUSUM chart are considered when the MLE and the Bayes estimator are used. The results show that chart performance with estimated control limits based on the Bayes estimator is generally better than that based on the MLE.
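The estimation issue described in the abstract above can be illustrated with a small sketch. It assumes a Phase I sample summarized as a count of nonconforming items among the items inspected, and a conjugate Beta prior with the posterior mean as the Bayes estimate. The abstract does not give the paper's exact estimator or prior, so these choices, the function names, and the numbers are all assumptions.

```python
def mle_proportion(nonconforming, inspected):
    """MLE of the proportion nonconforming: observed fraction."""
    if inspected <= 0:
        raise ValueError("inspected must be positive")
    return nonconforming / inspected

def bayes_proportion(nonconforming, inspected, prior_a, prior_b):
    """Posterior mean under an assumed Beta(prior_a, prior_b) prior."""
    if inspected <= 0 or prior_a <= 0 or prior_b <= 0:
        raise ValueError("inspected and prior parameters must be positive")
    return (prior_a + nonconforming) / (prior_a + prior_b + inspected)

# Hypothetical Phase I sample with no nonconforming items observed.
inspected_items = 1000
observed_nonconforming = 0

p_mle = mle_proportion(observed_nonconforming, inspected_items)
# Hypothetical prior centered on 0.001 (prior mean a / (a + b) = 1 / 1000).
p_bayes = bayes_proportion(observed_nonconforming, inspected_items, 1.0, 999.0)

print(p_mle)    # zero estimate: control limits cannot be computed from it
print(p_bayes)  # strictly positive estimate
```

With no nonconforming items in the sample, the MLE is exactly zero, whereas the posterior mean stays strictly positive because the prior contributes pseudo-observations.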
Analysis of Survivability for Combatants during Offensive Operations at the Tactical Level
Kim, Jaeoh ; Cho, HyungJun ; Kim, GakGyu ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 921~932
DOI : 10.5351/KJAS.2015.28.5.921
This study analyzed the survivability of military personnel in offensive operations using scientific military training data from a reinforced infantry battalion. Scientific battle training was conducted at the Korea Combat Training Center (KCTC) training facility and used scientific military training equipment, including MILES and the main exercise control system. The training audience freely engaged an OPFOR that is expert in tactics and weapon systems. The study provides a statistical analysis of state-of-the-art military training data, which is possible because the scientific battle training system saves all training zone data and uses it for analysis and after action review, and also offers training control during the training period. The methods used were Cox PH modeling (which does not require parametric distribution assumptions) and decision tree modeling for survival data, such as CART, GUIDE, and CTREE, for richer and easier interpretation. Variables that violated the PH assumption were stratified and analyzed. Because the effect of period of service was not easy to interpret from the Cox PH model results, additional interpretation was attempted through univariate local regression. CART, GUIDE, and CTREE produced different tree models, which allow for various interpretations.
A Study on Domestic Drama Rating Prediction
Kang, Suyeon ; Jeon, Heejeong ; Kim, Jihye ; Song, Jongwoo ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 933~949
DOI : 10.5351/KJAS.2015.28.5.933
Audience rating competition in the domestic drama market has increased recently due to the introduction of commercial broadcasting and the diversification of channels. There is now a need for thorough study and analysis of audience ratings. In particular, drama ratings are an important measure for producers and advertisers when estimating advertisement costs. In this paper, we study drama rating prediction models using various data mining techniques such as linear regression, LASSO regression, random forest, and gradient boosting. The results show that initial drama ratings are affected by structural elements such as broadcasting station and broadcasting time. Average drama ratings are also influenced by earlier public opinion, such as the number of internet searches about the drama.
Practical Designs, Analysis and Concepts Optimization in Conjoint Analysis
Lim, Yong B. ; Chung, Jong Hee ; Kim, Joo H. ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 951~963
DOI : 10.5351/KJAS.2015.28.5.951
Conjoint analysts in marketing are eager to know whether synergistic or antagonistic effects exist between two attributes. That is, they are interested in estimating the two-factor interaction effects as well as the main effects. We study the design of survey questionnaires so that all main effects and two-factor interaction effects are estimable by employing the resolution V balanced incomplete block fractional factorial design. We screen the vital few effects, find the proper model, and obtain information for efficient concept optimization by analyzing the survey data of all respondents.
Variable Selection in Frailty Models using FrailtyHL R Package: Breast Cancer Survival Data
Kim, Bohyeon ; Ha, Il Do ; Noh, Maengseok ; Na, Myung Hwan ; Song, Ho-Chun ; Kim, Jahae ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 965~976
DOI : 10.5351/KJAS.2015.28.5.965
Determining relevant variables for a regression model is important in regression analysis. Recently, variable selection methods using a penalized likelihood with various penalty functions (e.g. LASSO and SCAD) have been widely studied in simple statistical models such as linear models and generalized linear models. The advantage of these methods is that they select important variables and estimate regression coefficients simultaneously; that is, they delete insignificant variables by estimating their coefficients as zero. We study how to select proper variables based on penalized hierarchical likelihood (HL) in semi-parametric frailty models, allowing three penalty functions: LASSO, SCAD, and HL. For variable selection, we develop a new function in the "frailtyHL" R package. Our methods are illustrated with breast cancer survival data from the Medical Center at Chonnam National University in Korea. We compare the results from the three variable-selection methods and discuss their advantages and disadvantages.
A Study on the Number of Domestic Food Delivery Services
Kwon, Jaeyoung ; Kim, Sinae ; Park, Eungee ; Song, Jongwoo ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 977~990
DOI : 10.5351/KJAS.2015.28.5.977
Food delivery services are well developed in the Republic of Korea. The increase in one-person households and the success of mobile apps have recently influenced delivery services. We consider a prediction model for food delivery services based on weather and dates to predict the number of food deliveries in 2014 using various data mining techniques. We use linear regression, random forest, gradient boosting, support vector machines, neural networks, and logistic regression to find the best prediction model. There are four categories of food delivery services, and we consider two methods. In the first method, we estimate the total number of deliveries and the posterior probabilities of each delivery service. In the second method, we use a different model for each category and combine them to estimate the total number of deliveries. The neural network and linear regression models perform best in the first method, and the neural network performs best in the second method. The results show that the number of deliveries can be estimated accurately based on dates and weather information.
Image Noise Reduction Filter Based on Robust Regression Model
Kim, Yeong-Hwa ; Park, Youngho ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 991~1001
DOI : 10.5351/KJAS.2015.28.5.991
Digital images acquired by digital devices are used in many fields. Applying statistical methods to image processing can increase speed and efficiency. Methods to remove noise and improve image quality have been researched as a basic operation of image processing. This paper proposes a novel noise reduction method that considers the direction and magnitude of the edge to remove image noise effectively using statistical methods. The proposed method estimates the brightness of a pixel relative to pixels in the same direction based on a robust regression model. The estimate of pixel brightness is obtained by weighting by the magnitude of the edge, which improves on the performance of the average filter. The simulation study confirms that the proposed method retains well-characterized pixels and improves noise reduction performance over conventional methods.
Explanation of Runs Lost Using Combined Fielding Indices in Korean Professional Baseball
Kim, Hyuk Joo ; Kim, Yea Hyoung ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 1003~1011
DOI : 10.5351/KJAS.2015.28.5.1003
We studied indices to explain runs lost for Korean professional baseball teams. Kim and Kim (2014) studied batting indices to explain the run productivity of teams; subsequently, we studied fielding indices to explain runs lost. We considered several combined indices made by combining fielding indices closely connected with the runs lost of teams. Data analysis of all games in the regular seasons of 1982~2014 shows that weighted WPH (defined as a weighted average of WHIP and the number of home runs allowed per game) best explains runs lost. The weighted WPH consisting of WHIP (with weight 81%) and number of home runs allowed per game (with weight 19%) was found to be optimal, with a correlation coefficient of 0.95033 with average runs lost per game. Analysis by chronological period gave similar results.
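Read literally, the weighted WPH index in the abstract above is a convex combination of two team statistics. The sketch below follows that reading with the reported weights of 81% and 19%. Whether the paper rescales either input before combining them is not stated in the abstract, so the direct combination, the function name, and the sample values are assumptions.

```python
def weighted_wph(whip, home_runs_allowed_per_game, whip_weight=0.81):
    """Weighted average of WHIP and home runs allowed per game.

    whip_weight defaults to the 81% weight reported in the abstract;
    the remaining weight (19%) goes to home runs allowed per game.
    """
    if not 0.0 <= whip_weight <= 1.0:
        raise ValueError("whip_weight must be between 0 and 1")
    return whip_weight * whip + (1.0 - whip_weight) * home_runs_allowed_per_game

# Hypothetical team-season values, not taken from the paper.
print(round(weighted_wph(1.40, 0.90), 3))
```

A higher value of the index would correspond to more runs lost per game, given the strong positive correlation reported.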
Performance Comparison of Estimation Methods for Dynamic Conditional Correlation
Lee, Jiho ; Seong, Byeongchan ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 1013~1024
DOI : 10.5351/KJAS.2015.28.5.1013
We compare the performance of two representative estimation methods for the dynamic conditional correlation (DCC) GARCH model. The first method is pairwise estimation, which exploits partial information from the paired series, irrespective of the dimension of the time series. The second is multi-dimensional estimation, which uses the full information of the time series. For the simulation, we generate a multivariate time series similar to those observed in real markets and fit a DCC GARCH model. As an empirical example, we construct various portfolios using real KOSPI 200 sector indices and estimate the volatility and value at risk (VaR) of the portfolios. We evaluate the performance of the two methods through the estimated dynamic correlations from the simulation and the estimated volatility and VaR of the portfolios. We observe that multi-dimensional estimation tends to be superior to pairwise estimation; in addition, relatively uncorrelated series can improve the performance of multi-dimensional estimation.
Effects of Parameter Estimation in Phase I on Phase II Control Limits for Monitoring Autocorrelated Data
Lee, Sungim ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 1025~1034
DOI : 10.5351/KJAS.2015.28.5.1025
Traditional Shewhart control charts assume that the observations are independent over time. Current progress in measurement and data collection technology leads to the presence of autocorrelated process data, which may cause poor performance in statistical process control. One of the most popular approaches for autocorrelated data is to model the correlation structure with an appropriate time series model and apply a control chart to the sequence of residuals. Because the model parameters are usually unknown in practice, they are estimated from an in-control Phase I reference sample. This paper deals with the effects of parameter estimation on Phase II control limits for monitoring autocorrelated data.
Effect of Experimental Layout on Model Selection under Variance Components Models: A Simulation Study
Lee, Yonghee ;
Korean Journal of Applied Statistics, volume 28, issue 5, 2015, Pages 1035~1046
DOI : 10.5351/KJAS.2015.28.5.1035
Variance components models incorporate various random factors in the form of linear models. There are two experimental layouts for the classification of factors under variance components models: nested classification and crossed classification. We consider two-way variance components models and investigate the effect of experimental layout on the performance of the model selection criteria AIC and BIC. The effect of experimental layout is studied through a simulation study with systematically varied combinations of parameters. The simulation study shows differences in the performance of model selection methods between the two classifications. In particular, there is a tendency to prefer a model smaller than the true model when the variance component of a nested factor becomes large relative to that of the nesting factor, and this tendency persists even when the sample size is not small.