• Title/Summary/Keyword: variable selection

Bias Reduction in Split Variable Selection in C4.5

  • Shin, Sung-Chul;Jeong, Yeon-Joo;Song, Moon Sup
    • Communications for Statistical Applications and Methods
    • /
    • v.10 no.3
    • /
    • pp.627-635
    • /
    • 2003
  • In this short communication we discuss the bias problem of C4.5 in split variable selection and suggest a method to reduce the variable selection bias among categorical predictor variables. A penalty proportional to the number of categories is applied to the splitting criterion gain of C4.5. The results of empirical comparisons show that the proposed modification of C4.5 reduces the size of classification trees.
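The abstract describes penalizing the C4.5 gain criterion in proportion to the number of categories of a predictor. A minimal sketch of that idea, assuming a simple linear penalty `alpha * (number of categories)` (the paper's exact penalty form is not given in the abstract, and the function names here are hypothetical):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def penalized_gain(labels, split_values, alpha=0.05):
    """Information gain of a categorical split, minus a penalty proportional
    to the number of categories.  The linear form alpha * k is an assumption
    for illustration; many-category predictors are penalized more heavily,
    which counteracts the selection bias toward them."""
    n = len(labels)
    groups = {}
    for y, v in zip(labels, split_values):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    return gain - alpha * len(groups)
```

For a perfect two-category split of balanced binary labels, the raw gain is 1 bit and the penalized gain is `1 - alpha * 2`.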

Variable Selection Theorems in General Linear Model

  • Park, Jeong-Soo;Yoon, Sang-Hoo
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2006.04a
    • /
    • pp.171-179
    • /
    • 2006
  • For the problem of variable selection in linear models, we consider the case where the errors are correlated with covariance matrix V. Hocking's theorems on the effects of overfitting and underfitting in the linear model are extended to the less-than-full-rank and correlated-error model, and to the ANCOVA model.

Variable Selection Theorems in General Linear Model

  • Yoon, Sang-Hoo;Park, Jeong-Soo
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2005.11a
    • /
    • pp.187-192
    • /
    • 2005
  • For the problem of variable selection in linear models, we consider the case where the errors are correlated with covariance matrix V. Hocking's theorems on the effects of overfitting and underfitting in the linear model are extended to the less-than-full-rank and correlated-error model, and to the ANCOVA model.

Bayesian Parameter Estimation and Variable Selection in Random Effects Generalised Linear Models for Count Data

  • Oh, Man-Suk;Park, Tae-Sung
    • Journal of the Korean Statistical Society
    • /
    • v.31 no.1
    • /
    • pp.93-107
    • /
    • 2002
  • Random effects generalised linear models are useful for analysing clustered count data in which responses are usually correlated. We propose a Bayesian approach to parameter estimation and variable selection in random effects generalised linear models for count data. A simple Gibbs sampling algorithm for parameter estimation is presented, and variable selection is carried out simply and efficiently using the Gibbs outputs. An illustrative example is provided.
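The abstract says variable selection is done using the Gibbs outputs. A common device for this, sketched here as an assumption (the paper's exact rule is not stated in the abstract), is to sample 0/1 inclusion indicators in the Gibbs run and estimate each variable's posterior inclusion probability as the mean of its indicator draws:

```python
import numpy as np

def inclusion_probs(gamma_samples):
    """Posterior inclusion probabilities from Gibbs output.

    gamma_samples: array-like of shape (n_draws, p) holding 0/1 inclusion
    indicators for p candidate variables.  Selecting variables whose
    probability exceeds 0.5 is the usual median-probability rule; this is
    an illustrative convention, not necessarily the authors' rule."""
    return np.asarray(gamma_samples, dtype=float).mean(axis=0)
```

For example, four draws `[[1,0],[1,1],[1,0],[0,0]]` give probabilities 0.75 and 0.25, so only the first variable passes the 0.5 threshold.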

Variable Selection with Nonconcave Penalty Function on Reduced-Rank Regression

  • Jung, Sang Yong;Park, Chongsun
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.1
    • /
    • pp.41-54
    • /
    • 2015
  • In this article, we propose nonconcave penalties on a reduced-rank regression model to select variables and estimate coefficients simultaneously. We apply the HARD (hard thresholding) and SCAD (smoothly clipped absolute deviation) penalty functions, which are symmetric, singular at the origin, and bounded by a constant to reduce bias. In our simulation study and real data analysis, the new method is compared with an existing variable selection method using the $L_1$ penalty and exhibits competitive performance in prediction and variable selection. Instead of using only one type of penalty function, we use two or three penalty functions simultaneously, taking advantage of their different properties to select relevant predictors and estimate coefficients so as to improve the overall performance of model fitting.
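The SCAD and HARD penalties named in the abstract have standard closed forms (SCAD is due to Fan and Li, 2001). Both are singular at the origin, which allows exact zeros, and flatten out to a constant for large coefficients, which reduces bias relative to the $L_1$ penalty. A direct transcription of those standard forms:

```python
def scad_penalty(t, lam=1.0, a=3.7):
    """SCAD penalty: linear (slope lam) near zero, quadratic blend on
    (lam, a*lam], and constant (a+1)*lam^2/2 beyond a*lam.  a=3.7 is the
    value Fan and Li recommend from their Bayes-risk argument."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))
    return (a + 1) * lam ** 2 / 2

def hard_penalty(t, lam=1.0):
    """HARD thresholding penalty: lam^2 - (|t| - lam)^2 for |t| < lam,
    constant lam^2 otherwise."""
    t = abs(t)
    return lam ** 2 - (t - lam) ** 2 if t < lam else lam ** 2
```

Note that both functions are continuous at their breakpoints: for SCAD with `lam=1`, the value at `t=1` is 1 from either branch, and the penalty saturates at `(a+1)/2 = 2.35` for large `t`.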

Variable Selection with Regression Trees

  • Chang, Young-Jae
    • The Korean Journal of Applied Statistics
    • /
    • v.23 no.2
    • /
    • pp.357-366
    • /
    • 2010
  • Many tree algorithms have been developed for regression problems. Although they are regarded as good algorithms, most of them suffer from loss of prediction accuracy when there are many noise variables. To handle this problem, we propose the multi-step GUIDE, a regression tree algorithm with a variable selection process. The results of our simulation study show that the multi-step GUIDE outperforms well-known algorithms such as Random Forest and MARS in terms of variable selection and prediction accuracy. It generally selects the important variables correctly with relatively few noise variables and consequently gives good prediction accuracy.

A convenient approach for penalty parameter selection in robust lasso regression

  • Kim, Jongyoung;Lee, Seokho
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.6
    • /
    • pp.651-662
    • /
    • 2017
  • We propose an alternative procedure for selecting the penalty parameter in $L_1$-penalized robust regression. The procedure is based on marginalizing a prior distribution over the penalty parameter, so the resulting objective function does not include the penalty parameter. In addition, the estimating algorithm automatically chooses a penalty parameter using the previous estimate of the regression coefficients. The proposed approach bypasses cross-validation and thus saves computing time. Variable-wise penalization also performs best in terms of prediction and variable selection. Through a simulation study and an application to the Boston housing data, we demonstrate that our proposals are competitive with, or much better than, cross-validation in terms of prediction, variable selection, and computing time.
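The abstract's key idea is that the algorithm updates the penalty parameter from the previous coefficient estimates instead of cross-validating it. A toy sketch of that flavor, and only the flavor, follows: it is not the authors' algorithm (in particular, it uses a plain squared-error lasso via coordinate descent, not robust regression), and the gamma-prior-style update `lam = (p + a) / (sum|beta| + b)` is an assumed form chosen for illustration:

```python
import numpy as np

def auto_lambda_lasso(X, y, a=1.0, b=1.0, n_iter=50):
    """Lasso by coordinate descent with soft-thresholding, where the
    penalty parameter is refreshed each sweep from the current coefficient
    estimates (a hypothetical update echoing a marginalized gamma prior).
    This is an illustrative sketch, not the paper's robust procedure."""
    n, p = X.shape
    beta = np.zeros(p)
    lam = 1.0
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual excluding coordinate j
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_ss[j]
        # penalty chosen from the previous coefficient estimates
        lam = (p + a) / (np.abs(beta).sum() + b)
    return beta, lam
```

On simulated data with a sparse true coefficient vector, the loop recovers the large coefficients while shrinking the irrelevant ones toward zero, with no cross-validation grid anywhere.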

Efficient estimation and variable selection for partially linear single-index-coefficient regression models

  • Kim, Young-Ju
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.1
    • /
    • pp.69-78
    • /
    • 2019
  • A structured model with both single-index and varying coefficients is a powerful tool in modeling high dimensional data. It has been widely used because the single-index can overcome the curse of dimensionality and varying coefficients can allow nonlinear interaction effects in the model. For high dimensional index vectors, variable selection becomes an important question in the model building process. In this paper, we propose an efficient estimation and a variable selection method based on a smoothing spline approach in a partially linear single-index-coefficient regression model. We also propose an efficient algorithm for simultaneously estimating the coefficient functions in a data-adaptive lower-dimensional approximation space and selecting significant variables in the index with the adaptive LASSO penalty. The empirical performance of the proposed method is illustrated with simulated and real data examples.

Forecasting the Baltic Dry Index Using Bayesian Variable Selection (베이지안 변수선택 기법을 이용한 발틱건화물운임지수(BDI) 예측)

  • Xiang-Yu Han;Young Min Kim
    • Korea Trade Review
    • /
    • v.47 no.5
    • /
    • pp.21-37
    • /
    • 2022
  • The Baltic Dry Index (BDI) is difficult to forecast because of its high volatility and complexity. To improve BDI forecasting ability, this study applies a Bayesian variable selection method with a large number of predictors. Our estimation results, based on the BDI and all predictors from January 2000 to September 2021, indicate that the out-of-sample prediction ability of the ADL model with variable selection is superior to that of the AR model in terms of both point and density forecasting. We also find that the critical predictors for the BDI change over the forecast horizon. The lagged BDI is selected as a key predictor at all forecast horizons, but commodity prices, the ClarkSea Index, and interest rates carry additional information for predicting the BDI at the mid-term horizon. This implies that time variation in the predictors should be considered when forecasting the BDI.

Fast Frame Selection Method for Multi-Reference and Variable Block Motion Estimation (다중참조 및 가변블록 움직임 추정을 위한 고속 참조영상 선택 방법)

  • Kim, Sung-Dae;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.6
    • /
    • pp.1-8
    • /
    • 2008
  • This paper introduces three efficient frame selection schemes to reduce the computational complexity of multi-reference and variable block size Motion Estimation (ME). The proposed RSP (Reference Selection Pass) scheme can minimize the overhead of frame selection. The MFS (Modified Frame Selection) scheme can reduce the number of search points by about 18% compared with existing schemes by considering the motion of the image during the reference frame selection process. In addition, the TPRFS (Two-Pass Reference Frame Selection) scheme can minimize the frame selection operations for variable block size ME in H.264/AVC by using the characteristics of the selected reference frame according to block size. The simulation results show that the proposed schemes can save up to 50% of the ME computation without degradation of image quality. Because the proposed schemes can be separated from the block matching process, they can be used with any existing single-reference fast search algorithm.