• Title/Summary/Keyword: variable selection

Ordinal Variable Selection in Decision Trees (의사결정나무에서 순서형 분리변수 선택에 관한 연구)

  • Kim Hyun-Joong
    • The Korean Journal of Applied Statistics, v.19 no.1, pp.149-161, 2006
  • The most important component in a decision tree algorithm is the rule for split variable selection. Many earlier algorithms, such as CART and C4.5, use a greedy search for variable selection. Recently, many methods have been developed to cope with the weaknesses of greedy search. Most algorithms have different selection criteria depending on the type of variable: continuous or nominal. However, ordinal variables are usually treated as continuous ones. This approach causes no trouble for methods based on greedy search, but it may cause problems for the newer algorithms because they use statistical methods valid only for continuous or nominal types. In this paper, we propose an ordinal variable selection method that uses the Cramér-von Mises testing procedure. We performed comparisons among CART, C4.5, QUEST, CRUISE, and the new method, and showed that the new method has good variable selection power for ordinal variables.
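
The selection rule described in the abstract — compare each candidate variable's distribution across the classes and pick the strongest discriminator — can be imitated with SciPy's two-sample Cramér-von Mises test. This is a toy sketch of the idea, not the paper's algorithm; the data and variable names are invented:

```python
import numpy as np
from scipy.stats import cramervonmises_2samp

rng = np.random.default_rng(0)

# toy data: x1 is an informative ordinal predictor, x2 is pure noise
y = rng.integers(0, 2, size=200)          # binary class label
x1 = y + rng.integers(0, 3, size=200)     # ordinal levels shift with the class
x2 = rng.integers(0, 5, size=200)         # uninformative ordinal variable

def cvm_pvalue(x, cls):
    """Two-sample Cramér-von Mises p-value comparing x across the two classes."""
    return cramervonmises_2samp(x[cls == 0], x[cls == 1]).pvalue

# the candidate with the smallest p-value wins the split-variable contest
pvals = {"x1": cvm_pvalue(x1, y), "x2": cvm_pvalue(x2, y)}
split_var = min(pvals, key=pvals.get)
```

Using a p-value rather than a raw impurity reduction is what removes the selection bias of greedy search toward variables with many distinct values.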

Robust varying coefficient model using L1 regularization

  • Hwang, Changha;Bae, Jongsik;Shim, Jooyong
    • Journal of the Korean Data and Information Science Society, v.27 no.4, pp.1059-1066, 2016
  • In this paper we propose a robust version of the varying coefficient model, based on regularized regression with L1 regularization. We use an iteratively reweighted least squares procedure to solve the L1-regularized objective function of the varying coefficient model in locally weighted regression form. This provides efficient computation of the coefficient function estimates and variable selection for a given value of the smoothing variable. We present a generalized cross validation function and an Akaike-information-type criterion for model selection. Applications of the proposed model are illustrated through artificial examples and a real example of predicting the effect of the input variables and the smoothing variable on the output.
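
The reweighting device mentioned here is generic: an L1 term sum|b_j| can be majorized by b_j² / |b_j_old|, so each IRLS step reduces to a weighted ridge solve. A minimal sketch under that reading, in a plain linear model rather than the authors' varying-coefficient setting (data and lambda are invented):

```python
import numpy as np

def irls_l1(X, y, lam=5.0, n_iter=50, eps=1e-6):
    """IRLS sketch for an L1 penalty: |b_j| is approximated by
    b_j**2 / |b_j_old|, so each iteration is a ridge solve with
    per-coefficient weights lam / |b_j_old|."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]        # OLS start
    for _ in range(n_iter):
        D = np.diag(lam / (np.abs(b) + eps))        # reweighted ridge penalty
        b = np.linalg.solve(X.T @ X + D, X.T @ y)
    return b

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = X @ np.array([2.0, 0.0, 0.0, 1.0]) + rng.normal(scale=0.1, size=200)
b_hat = irls_l1(X, y)   # coefficients 1 and 2 are driven toward zero
```

Small coefficients get ever-larger ridge weights across iterations and collapse toward zero, which is how the reweighting performs variable selection.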

Variable selection in partial linear regression using the least angle regression (부분선형모형에서 LARS를 이용한 변수선택)

  • Seo, Han Son;Yoon, Min;Lee, Hakbae
    • The Korean Journal of Applied Statistics, v.34 no.6, pp.937-944, 2021
  • The problem of selecting variables is addressed in partial linear regression. Model selection for partial linear models is not easy, since it involves nonparametric estimation, such as smoothing parameter selection, as well as estimation for the linear explanatory variables. In this work, several approaches for variable selection are proposed using a fast forward selection algorithm, least angle regression (LARS). The proposed procedures apply a t-test, all-possible-regressions comparisons, or a stepwise selection process to the variables selected by LARS. An example based on real data and a simulation study on the performance of the suggested procedures are presented.
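
The entry order that LARS produces is available from scikit-learn's `lars_path`, after which the screened variables can be refit and tested as the abstract proposes. A sketch with synthetic data; the two-variable refit is an illustration, not the paper's exact procedure:

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(2)
n, p = 100, 6
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=n)

# `active` records the order in which LARS lets variables enter the model
alphas, active, coefs = lars_path(X, y, method="lar")

# screen to the first two entrants, then refit plain least squares on them
screened = [int(j) for j in active[:2]]
b_ols, *_ = np.linalg.lstsq(X[:, screened], y, rcond=None)
```

The refitted coefficients could then be subjected to t-tests or stepwise comparisons to decide how far down the LARS ordering to keep variables.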

COMPARISON OF VARIABLE SELECTION AND STRUCTURAL SPECIFICATION BETWEEN REGRESSION AND NEURAL NETWORK MODELS FOR HOUSEHOLD VEHICULAR TRIP FORECASTING

  • Yi, Jun-Sub
    • Journal of Applied Mathematics & Informatics, v.6 no.2, pp.599-609, 1999
  • Neural networks are explored as an alternative to a regression model for predicting the number of daily household vehicular trips. This study focuses on contrasting a neural network model with a regression model in terms of variable selection, as well as the application of these models to the prediction of extreme observations. The differences between the models regarding data transformation, variable selection, and multicollinearity are considered. The results indicate that the neural network model is a viable alternative to the regression model for addressing both messy-data problems and limitations in variable structure specification.

Variable selection with quantile regression tree (분위수 회귀나무를 이용한 변수선택 방법 연구)

  • Chang, Youngjae
    • The Korean Journal of Applied Statistics, v.29 no.6, pp.1095-1106, 2016
  • The quantile regression method proposed by Koenker and Bassett (1978) focuses on the conditional quantiles given the independent variables, and analyzes the relationship between the response variable and the independent variables at a given quantile. Because the estimation of quantile regression coefficients relies on linear programming, model fitting can be difficult when large data sets are analyzed. Therefore, dimension reduction (or variable selection) can be a good solution for quantile regression on large data sets. In this paper, regression tree methods are applied to variable selection for quantile regression. Real data on Korea Baseball Organization (KBO) players are analyzed following the variable selection approach based on the regression tree. The analysis shows that a few important variables are selected, which are also meaningful for the given quantiles of the players' salary data.

Variable Selection in Clustering by Recursive Fit of Normal Distribution-based Salient Mixture Model (정규분포기반 두각 혼합모형의 순환적 적합을 이용한 군집분석에서의 변수선택)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics, v.26 no.5, pp.821-834, 2013
  • Law et al. (2004) proposed a normal distribution-based salient mixture model for variable selection in clustering. However, this model has substantial problems, such as the unidentifiability of components and the inaccurate selection of informative variables when cluster sizes are small. We propose an alternative method to overcome these problems and demonstrate good performance through experiments on simulated and real data.

Variable selection in L1 penalized censored regression

  • Hwang, Chang-Ha;Kim, Mal-Suk;Shi, Joo-Yong
    • Journal of the Korean Data and Information Science Society, v.22 no.5, pp.951-959, 2011
  • The proposed method is based on a penalized censored regression model with an L1 penalty. We use an iteratively reweighted least squares procedure to solve the L1-penalized log-likelihood function of the censored regression model. This provides efficient computation of the regression parameters, including variable selection, and leads to a generalized cross validation function for model selection. Numerical results are then presented to demonstrate the performance of the proposed method.

Two-Stage Penalized Composite Quantile Regression with Grouped Variables

  • Bang, Sungwan;Jhun, Myoungshic
    • Communications for Statistical Applications and Methods, v.20 no.4, pp.259-270, 2013
  • This paper considers a penalized composite quantile regression (CQR) that performs variable selection in the linear model with grouped variables. An adaptive sup-norm penalized CQR (ASCQR) is proposed to select variables in a grouped manner; the consistency and oracle property of the resulting estimator are derived under some regularity conditions. To improve the efficiency of estimation and variable selection, this paper suggests a two-stage penalized CQR (TSCQR), which uses the ASCQR to select relevant groups in the first stage and the adaptive lasso penalized CQR to select important variables in the second stage. Simulation studies are conducted to illustrate the finite-sample performance of the proposed methods.
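
The composite quantile regression objective — one common slope vector, one intercept per quantile level, check loss summed over the levels — can be written down directly and minimized numerically. A small unpenalized sketch on invented heavy-tailed data; the grouped penalties of the paper are omitted:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n = 200
X = rng.normal(size=(n, 2))
y = X @ np.array([1.5, -1.0]) + rng.standard_t(df=3, size=n)  # heavy-tailed noise

taus = np.array([0.25, 0.5, 0.75])    # composite over K = 3 quantile levels

def check_loss(u, tau):
    """Koenker-Bassett check function rho_tau(u)."""
    return np.where(u >= 0, tau * u, (tau - 1.0) * u)

def cqr_objective(params):
    # one shared slope vector b, one intercept a_k per quantile level
    b, a = params[:2], params[2:]
    resid = y[:, None] - X @ b[:, None] - a[None, :]          # shape (n, K)
    return sum(check_loss(resid[:, k], taus[k]).sum() for k in range(len(taus)))

res = minimize(cqr_objective, x0=np.zeros(2 + len(taus)), method="Nelder-Mead",
               options={"maxiter": 20000, "fatol": 1e-8, "xatol": 1e-8})
b_hat = res.x[:2]
```

Sharing the slopes across quantile levels is what makes CQR more efficient than a single quantile regression when the errors are heavy-tailed.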

Penalized rank regression estimator with the smoothly clipped absolute deviation function

  • Park, Jong-Tae;Jung, Kang-Mo
    • Communications for Statistical Applications and Methods, v.24 no.6, pp.673-683, 2017
  • The least absolute shrinkage and selection operator (LASSO) has been a popular regression estimator that performs simultaneous variable selection. However, the LASSO does not have the oracle property, and a robust version is needed in the case of heavy-tailed errors or serious outliers. We propose a robust penalized regression estimator that provides simultaneous variable selection and estimation. It is based on rank regression and a non-convex penalty, the smoothly clipped absolute deviation (SCAD) function, which has the oracle property. The proposed method combines the robustness of rank regression with the oracle property of the SCAD penalty. We develop an efficient algorithm to compute the proposed estimator, which obtains a SCAD estimate through the local linear approximation so that the estimate can be computed by the least absolute deviation method, and we choose the tuning parameter of the penalty function by the Bayesian information criterion and cross validation. Numerical simulations show that the proposed estimator is robust and effective for analyzing contaminated data.
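
The local linear approximation (LLA) mentioned in the abstract replaces the SCAD penalty by a weighted L1 penalty with weights SCAD'(|b_j|) evaluated at a pilot estimate, so one LLA step is just a weighted lasso. A least-squares sketch of that step on invented data; the paper uses rank regression in place of squared-error loss:

```python
import numpy as np

def scad_deriv(b, lam, a=3.7):
    """Derivative of the SCAD penalty of Fan and Li (2001)."""
    ab = np.abs(b)
    return np.where(ab <= lam, lam, np.maximum(a * lam - ab, 0.0) / (a - 1.0))

def weighted_lasso_cd(X, y, w, n_iter=200):
    """Coordinate descent for (1/2)||y - Xb||^2 + n * sum_j w_j |b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]       # partial residual excluding j
            rho = X[:, j] @ r
            b[j] = np.sign(rho) * max(abs(rho) - n * w[j], 0.0) / col_ss[j]
    return b

rng = np.random.default_rng(5)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = X @ np.array([3.0, 0.0, 0.0, 1.5, 0.0]) + rng.normal(scale=0.5, size=n)

lam = 0.3
b_init = np.linalg.lstsq(X, y, rcond=None)[0]    # pilot (OLS) estimate
w = scad_deriv(b_init, lam)                      # LLA: SCAD becomes weighted L1
b_hat = weighted_lasso_cd(X, y, w)
```

Large pilot coefficients get zero SCAD-derivative weight and are left unpenalized, which is exactly how the SCAD penalty avoids the bias that gives the plain LASSO its oracle-property failure.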

Penalized variable selection for accelerated failure time models

  • Park, Eunyoung;Ha, Il Do
    • Communications for Statistical Applications and Methods, v.25 no.6, pp.591-604, 2018
  • The accelerated failure time (AFT) model is a linear model for the log-transformed survival time that has been introduced as a useful alternative to the proportional hazards (PH) model. In this paper we propose variable selection procedures for the fixed effects in a parametric AFT model using penalized likelihood approaches. We use three popular penalty functions: the least absolute shrinkage and selection operator (LASSO), the adaptive LASSO, and the smoothly clipped absolute deviation (SCAD) penalty. With these procedures we can select important variables and estimate the fixed effects at the same time. The performance of the proposed method is evaluated using simulation studies, including an investigation of the impact of misspecifying the assumed distribution. The proposed method is illustrated with a primary biliary cirrhosis (PBC) data set.
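
A parametric AFT fit with an L1 (LASSO) penalty can be sketched as a penalized log-likelihood for right-censored log-normal survival times. In this sketch the scale parameter is held fixed for simplicity, and the data and penalty strength are invented; it is not the authors' implementation:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(6)
n = 300
X = rng.normal(size=(n, 3))
beta_true = np.array([1.0, 0.0, -0.5])
log_t = X @ beta_true + rng.normal(scale=0.5, size=n)   # log survival time
log_c = rng.normal(loc=1.0, scale=1.0, size=n)          # log censoring time
log_y = np.minimum(log_t, log_c)                        # observed log time
event = log_t <= log_c                                  # True = failure observed

def neg_pen_loglik(beta, lam=5.0, sigma=0.5):
    z = (log_y - X @ beta) / sigma
    ll = norm.logpdf(z[event]).sum() - event.sum() * np.log(sigma)
    ll += norm.logsf(z[~event]).sum()     # censored subjects contribute S(t)
    return -ll + lam * np.abs(beta).sum() # LASSO penalty on fixed effects

res = minimize(neg_pen_loglik, x0=np.zeros(3), method="Powell")
beta_hat = res.x
```

Swapping the penalty term for an adaptive-LASSO or SCAD version changes only the last line of `neg_pen_loglik`, which is why the three penalties can share one fitting routine.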