• Title/Summary/Keyword: variable selection


A Study on the Selection of Dependent Variables of Momentum Equations in the General Curvilinear Coordinate System for Computational Fluid Dynamics (전산유체역학을 위한 일반 곡률좌표계에서 운동량 방정식의 종속변수 선정에 관한 연구)

  • Kim, Won-Kap;Choi, Young Don
    • Transactions of the Korean Society of Mechanical Engineers B / v.23 no.2 / pp.198-209 / 1999
  • This study reports on the selection of dependent variables for the momentum equations in general curvilinear coordinates. Cartesian, covariant, and contravariant velocity components were examined as candidate dependent variables. The study is confined to staggered grid systems. Each candidate was tested on several flow fields. The results show that Cartesian and covariant velocity components intrinsically cannot satisfy mass conservation over a control volume unless additional conversion steps are used. In addition, Cartesian components can be used only for flow fields in which the main-flow direction does not change significantly. The convergence rate with covariant velocity components drops quickly as the non-orthogonality of the grid increases. In contrast, contravariant velocity components rapidly reduce the total mass residual of the discretized equations to the limit of machine accuracy, and the solutions are insensitive to the main-flow direction.

A Study on Applying Shrinkage Method in Generalized Additive Model (일반화가법모형에서 축소방법의 적용연구)

  • Ki, Seung-Do;Kang, Kee-Hoon
    • The Korean Journal of Applied Statistics / v.23 no.1 / pp.207-218 / 2010
  • The generalized additive model (GAM) is a statistical model that resolves most of the problems of the traditional linear regression model. However, overfitting can occur if no method is applied to reduce the number of independent variables, so variable selection methods for GAMs are needed. Lasso-related methods have recently become popular for variable selection in regression analysis. In this research, we consider Group Lasso and Elastic Net models for variable selection in GAMs and propose an algorithm for finding solutions. We compare the proposed methods via Monte Carlo simulation and by applying them to auto insurance data from fiscal year 2005. The proposed methods are shown to yield better performance.

Variable Selection in Normal Mixture Model Based Clustering under Heteroscedasticity (이분산 상황 하에서 정규혼합모형 기반 군집분석의 변수선택)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1213-1224 / 2011
  • In high-dimensional settings, where the number of variables far exceeds the number of observations, noninformative variables must be removed in order to cluster the observations. Most model-based approaches to variable selection assume homoscedasticity, and their models are mainly estimated by a penalized likelihood method. In this paper, a different approach is proposed that removes noninformative variables effectively and simultaneously clusters the observations based on a modified normal mixture model. The validity of the model is established, and an EM algorithm is derived to estimate the parameters. Simulation studies and an experiment using a real microarray dataset show the effectiveness of the proposed method.

Empirical Analysis of Relationship between Internet Communication Network Quality Characteristics and Customer Satisfaction using Regression Variable Selection Procedures (회귀변수 선택절차를 이용한 인터넷통신 네트워크 품질특성과 고객만족도의 관계 실증분석)

  • Park, Sung-Min;Park, Young-Joon
    • IE interfaces / v.18 no.3 / pp.253-267 / 2005
  • Customer satisfaction has become an important managerial concern tied to corporate competency in the current competitive environment for Internet communication service companies. Hence, there is demand for improving a company's customer satisfaction from a total quality management perspective. In practice, engineers as well as management hope to find the major quality characteristics of the Internet communication network that are closely related to customer satisfaction, with the aim of raising their company's customer satisfaction. This paper presents an empirical analysis of the relationship between network quality characteristics and customer satisfaction in Internet communication. Methodologically, the analysis framework is based on regression variable selection procedures. The framework implements 1) iterative model building and 2) consistent application of criteria to the statistical tests used for selecting significant variables. A case study shows that 1) customer satisfaction with the network connection appears to be more closely related to the network quality characteristics than customer satisfaction with the network speed, and 2) the download disconnection rate has a relatively evident relationship with customer satisfaction with the network connection.

Variable Selection in PLS Regression with Penalty Function (벌점함수를 이용한 부분최소제곱 회귀모형에서의 변수선택)

  • Park, Chong-Sun;Moon, Guy-Jong
    • Communications for Statistical Applications and Methods / v.15 no.4 / pp.633-642 / 2008
  • A variable selection algorithm for partial least squares regression using a penalty function is proposed. We use the fact that the usual partial least squares regression problem can be expressed as a maximization problem with appropriate constraints, and we add a penalty function to this maximization problem. A simulated annealing algorithm can then be used to search for optimal solutions of the penalized maximization problem. The HARD penalty function is suggested as the best in several respects. Illustrations with real and simulated examples are provided.
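
The maximization view of partial least squares mentioned in this abstract can be sketched as follows for the first weight vector with a single response. This is the textbook formulation, not necessarily the exact one used in the paper; the symbols $X$ (centered $n \times k$ predictor matrix), $y$ (centered response), $w$ (weight vector), and the generic penalty $P_{\lambda}$ are assumptions introduced here for illustration.

```latex
\begin{align*}
% Unpenalized first PLS weight vector: maximize the squared covariance between Xw and y
\hat{w} &= \arg\max_{w \in \mathbb{R}^{k}} \; w^{\top} X^{\top} y \, y^{\top} X w
  \quad \text{subject to} \quad w^{\top} w = 1, \\
% Penalized variant: P_\lambda is a placeholder penalty that pushes small weights toward zero
\hat{w}_{\lambda} &= \arg\max_{w \in \mathbb{R}^{k}} \;
  \bigl\{ w^{\top} X^{\top} y \, y^{\top} X w - P_{\lambda}(w) \bigr\}
  \quad \text{subject to} \quad w^{\top} w = 1 .
\end{align*}
```

Variables whose weights are driven to zero by the penalty drop out of the component, which is how a penalized problem of this kind performs variable selection.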

An Analysis of Job Selection, Major-Job Match and Wage Level of College Graduates (대학 졸업생의 직업선택과 임금 수준)

  • Park, Jae-Min
    • Journal of Korea Technology Innovation Society / v.14 no.1 / pp.22-39 / 2011
  • This study examines wage levels from the viewpoint of major-job match, as part of an analysis of the skill mismatch problem among 4-year college graduates. The empirical analysis explicitly incorporates sample selection bias, an econometric problem that earlier studies suggested but only briefly introduced. The study also sets up the major-job match variable, which was usually handled as a binary variable for analytical convenience, as a polychotomous choice variable in the selection equation, as provided by the survey. In particular, it uses a multi-cohort survey of graduates from 1982, 1992, and 2002 for the empirical analysis. The empirical analysis identifies a wage premium for major-job match. This result holds after accounting for sample selection bias and after modeling the major-job match variable as polychotomous. An analysis by major identifies a relatively high wage premium among Social Science, Engineering, and Science majors, although the effect of selection differs among these majors. By assessing cohort effects, the study also finds that skill mismatch had progressed rapidly in 1992, while the difference between the 1992 and 2002 cohorts is insignificant. The analysis suggests that wage level is better understood in the context of both sample selection and major-job match, and that major-job match strongly affects wages regardless of model specification.


A small review and further studies on the LASSO

  • Kwon, Sunghoon;Han, Sangmi;Lee, Sangin
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.1077-1088 / 2013
  • High-dimensional data analysis arises in almost all scientific areas, has evolved with the development of computing, and has encouraged penalized estimation methods, which play important roles in statistical learning. In past years, various penalized estimation methods have been developed, and the least absolute shrinkage and selection operator (LASSO) proposed by Tibshirani (1996) has shown outstanding ability, taking a leading place in the development of penalized estimation. In this paper, we first introduce a number of recent advances in high-dimensional data analysis using the LASSO. The topics include various statistical problems such as variable selection and grouped or structured variable selection under sparse high-dimensional linear regression models. Several unsupervised learning methods, including inverse covariance matrix estimation, are presented. In addition, we address further studies on new applications, which may establish a guideline on how to use the LASSO for the statistical challenges of high-dimensional data analysis.
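
To make concrete how the LASSO performs variable selection, here is a minimal sketch in plain Python. It assumes the common objective $(1/2n)\lVert y - X\beta\rVert^2 + \lambda\lVert\beta\rVert_1$ and solves it by cyclic coordinate descent with soft-thresholding. The function names, the toy data, and the value of `lam` are illustrative assumptions and are not taken from the paper.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows (each a list of p floats); y is a list of n floats.
    No intercept is fitted, so the columns of X and y are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes j's own contribution
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy data (hypothetical): two orthogonal, centered predictors.
# y depends strongly on the first column and only weakly on the second.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]
beta = lasso_coordinate_descent(X, y, lam=0.5)
# beta[0] is shrunk from about 2.0 to about 1.5; beta[1] is set exactly to 0.0
```

In this toy example the weak predictor's coefficient falls below the threshold and is set exactly to zero, so the estimate both shrinks and selects. That combination is the defining behavior of the LASSO.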

A Study on Marriage Types and Courtship - focused on working women - (결혼유형에 따른 배우자 선택 과정의 차이에 관한 연구 - 취업 여성을 중심으로 -)

  • Kim Jin-Hee;Kim Yang-Hee
    • Journal of the Korean Home Economics Association / v.37 no.12 s.142 / pp.13-28 / 1999
  • This study aimed to analyze the courtship process, from dating to marriage, among women who held a job before marriage. On average, the respondents were 27.36 years old and had been married for 9.59 months. The study used structured questionnaires in which respondents reflected on the spouse selection process. The collected data were analyzed using descriptive statistics, cluster analysis, and t-tests. Marriages were divided into an emotional marriage group and an implemental marriage group. The emotional marriage group had a longer dating period and greater satisfaction with spouse selection than the implemental group. On the value variables, the emotional marriage group had more subjective selection standards and higher expectations of social and emotional benefits than the implemental group. On the search variables, the emotional marriage group rated the stability of the relationship higher, was more satisfied with the relationship with the spouse, and expected a lower possibility of meeting a new partner than the implemental marriage group.


A Bayesian Variable Selection Method for Binary Response Probit Regression

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society / v.28 no.2 / pp.167-182 / 1999
  • This article is concerned with the selection of subsets of predictor variables to be included in a binary response probit regression model. It takes a Bayesian approach and develops a procedure that uses probabilistic considerations for selecting promising subsets. The procedure reformulates the probit regression setup as a hierarchical normal mixture model by introducing a set of hyperparameters that are used to identify subset choices. The posterior probability of each subset of predictor variables is obtained through the Gibbs sampler, which samples indirectly from the multinomial posterior distribution on the set of possible subset choices. The most promising subset of predictors can thus be identified as the one with the highest posterior probability. To highlight the merit of this procedure, a couple of illustrative numerical examples are given.
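
One common way to write a hierarchical normal mixture setup of this kind is the stochastic-search form sketched below, combined with a latent-variable representation of the probit model. This is an assumed, generic formulation given for orientation only; the paper's actual priors and hyperparameters may differ. The symbols $z_i$, $\gamma_j$, $\tau_j$, $c_j$, and $\pi_j$ are introduced here and do not come from the abstract.

```latex
\begin{align*}
% Probit model via a latent Gaussian variable z_i
y_i &= \mathbf{1}(z_i > 0), \qquad z_i = x_i^{\top}\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, 1), \\
% Mixture prior: gamma_j = 0 gives a narrow "spike" (predictor effectively excluded),
% gamma_j = 1 gives a wide "slab" (predictor included), with c_j > 1
\beta_j \mid \gamma_j &\sim (1 - \gamma_j)\, N(0, \tau_j^{2}) + \gamma_j\, N(0, c_j^{2}\tau_j^{2}), \\
\gamma_j &\sim \mathrm{Bernoulli}(\pi_j), \qquad j = 1, \dots, k .
\end{align*}
```

Under a formulation like this, a Gibbs sampler alternates draws of $z$, $\beta$, and $\gamma = (\gamma_1, \dots, \gamma_k)$. The relative frequency of each visited $\gamma$ then estimates the posterior probability of the corresponding subset of predictors.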


Selection of markers in the framework of multivariate receiver operating characteristic curve analysis in binary classification

  • Sameera, G;Vishnu, Vardhan R
    • Communications for Statistical Applications and Methods / v.26 no.2 / pp.79-89 / 2019
  • Classification models based on receiver operating characteristic (ROC) curve analysis have been extended from the univariate to the multivariate setting by linearly combining multiple available markers. One such classification model is multivariate ROC curve analysis. However, in a real scenario not all markers contribute, and some may mask the contribution of other markers in classifying the individuals/objects. This paper addresses this issue by developing an algorithm that helps identify the important markers that are significant and true contributors. The proposed variable selection framework is supported by real datasets and a simulation study; it is shown to provide insight into each marker's significance in yielding a classifier rule/linear combination with a good extent of classification.