• Title/Summary/Keyword: variable selection

Search Result 874, Processing Time 0.033 seconds

The correlation and regression analyses based on variable selection for the university evaluation index (대학 평가지표들에 대한 상관분석과 변수선택에 의한 선형모형추정)

  • Song, Pil-Jun;Kim, Jong-Tae
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.3
    • /
    • pp.457-465
    • /
    • 2012
  • The purpose of this study is to analyze the association between indicators and to find statistical models based on important indicators at 'College Notifier' in Korea Council for University Education. First, Pearson correlation coefficients are used to find statistically significant correlations. By variable selection method, the important indicators are selected and their coefficients are estimated. As variable selection method, backward and stepwise methods are employed.

Analysis of multi-center bladder cancer survival data using variable-selection method of multi-level frailty models (다수준 프레일티모형 변수선택법을 이용한 다기관 방광암 생존자료분석)

  • Kim, Bohyeon;Ha, Il Do;Lee, Donghwan
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.2
    • /
    • pp.499-510
    • /
    • 2016
  • It is very important to select relevant variables in regression models for survival analysis. In this paper, we introduce a penalized variable-selection procedure in multi-level frailty models based on the "frailtyHL" R package (Ha et al., 2012). Here, the estimation procedure of models is based on the penalized hierarchical likelihood, and three penalty functions (LASSO, SCAD and HL) are considered. The proposed methods are illustrated with multi-country/multi-center bladder cancer survival data from the EORTC in Belgium. We compare the results of three variable-selection methods and discuss their advantages and disadvantages. In particular, the results of data analysis showed that the SCAD and HL methods select well important variables than in the LASSO method.

A comparison study of Bayesian variable selection methods for sparse covariance matrices (희박 공분산 행렬에 대한 베이지안 변수 선택 방법론 비교 연구)

  • Kim, Bongsu;Lee, Kyoungjae
    • The Korean Journal of Applied Statistics
    • /
    • v.35 no.2
    • /
    • pp.285-298
    • /
    • 2022
  • Continuous shrinkage priors, as well as spike and slab priors, have been widely employed for Bayesian inference about sparse regression coefficient vectors or covariance matrices. Continuous shrinkage priors provide computational advantages over spike and slab priors since their model space is substantially smaller. This is especially true in high-dimensional settings. However, variable selection based on continuous shrinkage priors is not straightforward because they do not give exactly zero values. Although few variable selection approaches based on continuous shrinkage priors have been proposed, no substantial comparative investigations of their performance have been conducted. In this paper, We compare two variable selection methods: a credible interval method and the sequential 2-means algorithm (Li and Pati, 2017). Various simulation scenarios are used to demonstrate the practical performances of the methods. We conclude the paper by presenting some observations and conjectures based on the simulation findings.

Variable Selection for Logistic Regression Model Using Adjusted Coefficients of Determination (수정 결정계수를 사용한 로지스틱 회귀모형에서의 변수선택법)

  • Hong C. S.;Ham J. H.;Kim H. I.
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.2
    • /
    • pp.435-443
    • /
    • 2005
  • Coefficients of determination in logistic regression analysis are defined as various statistics, and their values are relatively smaller than those for linear regression model. These coefficients of determination are not generally used to evaluate and diagnose logistic regression model. Liao and McGee (2003) proposed two adjusted coefficients of determination which are robust at the addition of inappropriate predictors and the variation of sample size. In this work, these adjusted coefficients of determination are applied to variable selection method for logistic regression model and compared with results of other methods such as the forward selection, backward elimination, stepwise selection, and AIC statistic.

An Application of the Clustering Threshold Gradient Descent Regularization Method for Selecting Genes in Predicting the Survival Time of Lung Carcinomas

  • Lee, Seung-Yeoun;Kim, Young-Chul
    • Genomics & Informatics
    • /
    • v.5 no.3
    • /
    • pp.95-101
    • /
    • 2007
  • In this paper, we consider the variable selection methods in the Cox model when a large number of gene expression levels are involved with survival time. Deciding which genes are associated with survival time has been a challenging problem because of the large number of genes and relatively small sample size (n<

Analysis Influential Factors for Media Selection in Banking Transaction Context (온.오프라인 은행거래를 위한 매체선택 영향 요인)

  • Cho, Nam-Jae;Park, Ki-Ho;Lim, Hae-Kyung
    • Journal of Digital Convergence
    • /
    • v.6 no.3
    • /
    • pp.75-84
    • /
    • 2008
  • The purpose of our this research, based on the Media Selection Theory, the Technology Acceptance Model, and the Social Influence Theory, is to investigate the influential factors that affect media selection in banking transactions. Analyses showed that for location sensitive bank window's and ATMs (automatic teller machines), defined as offline-based transaction channels, convenience was the variable affecting media selection. However, in the case of online media not related to location, (phone banking, internet banking, and mobile banking) reliability was the significant variable influencing use. The findings show that banking organizations may benefit from identifying traits of media affecting use, and should differentiate customer services for competitive advantage.

  • PDF

Major Criteria for Channel Selection in Banking Transaction

  • Cho, Nam-Jae;Park, Ki-Ho
    • Journal of Information Technology Applications and Management
    • /
    • v.16 no.1
    • /
    • pp.169-183
    • /
    • 2009
  • The purpose of this research, based on the Media Selection Theory, the Technology Acceptance Model, and the Social Influence Theory, is to investigate the influential factors that affect media selection in banking transactions. Analyses showed that for location sensitive bank windows and ATMs(automatic teller machines), defined as offline-based transaction channels, convenience was the variable affecting media selection. However, in the case of online media not related to location, (phone banking, internet banking, and mobile banking) reliability was the significant variable influencing use. The findings show that banking organizations may benefit from identifying traits of media affecting use, and should differentiate customer services for competitive advantage.

  • PDF

A Bayesian Method for Narrowing the Scope fo Variable Selection in Binary Response t-Link Regression

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society
    • /
    • v.29 no.4
    • /
    • pp.407-422
    • /
    • 2000
  • This article is concerned with the selecting predictor variables to be included in building a class of binary response t-link regression models where both probit and logistic regression models can e approximately taken as members of the class. It is based on a modification of the stochastic search variable selection method(SSVS), intended to propose and develop a Bayesian procedure that used probabilistic considerations for selecting promising subsets of predictor variables. The procedure reformulates the binary response t-link regression setup in a hierarchical truncated normal mixture model by introducing a set of hyperparameters that will be used to identify subset choices. In this setup, the most promising subset of predictors can be identified as that with highest posterior probability in the marginal posterior distribution of the hyperparameters. To highlight the merit of the procedure, an illustrative numerical example is given.

  • PDF

Optimal Variable Selection in a Thermal Error Model for Real Time Error Compensation (실시간 오차 보정을 위한 열변형 오차 모델의 최적 변수 선택)

  • Hwang, Seok-Hyun;Lee, Jin-Hyeon;Yang, Seung-Han
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.16 no.3 s.96
    • /
    • pp.215-221
    • /
    • 1999
  • The object of the thermal error compensation system in machine tools is improving the accuracy of a machine tool through real time error compensation. The accuracy of the machine tool totally depends on the accuracy of thermal error model. A thermal error model can be obtained by appropriate combination of temperature variables. The proposed method for optimal variable selection in the thermal error model is based on correlation grouping and successive regression analysis. Collinearity matter is improved with the correlation grouping and the judgment function which minimizes residual mean square is used. The linear model is more robust against measurement noises than an engineering judgement model that includes the higher order terms of variables. The proposed method is more effective for the applications in real time error compensation because of the reduction in computational time, sufficient model accuracy, and the robustness.

  • PDF

A two-step approach for variable selection in linear regression with measurement error

  • Song, Jiyeon;Shin, Seung Jun
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.1
    • /
    • pp.47-55
    • /
    • 2019
  • It is important to identify informative variables in high dimensional data analysis; however, it becomes a challenging task when covariates are contaminated by measurement error due to the bias induced by measurement error. In this article, we present a two-step approach for variable selection in the presence of measurement error. In the first step, we directly select important variables from the contaminated covariates as if there is no measurement error. We then apply, in the following step, orthogonal regression to obtain the unbiased estimates of regression coefficients identified in the previous step. In addition, we propose a modification of the two-step approach to further enhance the variable selection performance. Various simulation studies demonstrate the promising performance of the proposed method.