• Title/Summary/Keyword: explanatory variable


Comments on the regression coefficients (다중회귀에서 회귀계수 추정량의 특성)

  • Kahng, Myung-Wook
    • The Korean Journal of Applied Statistics / v.34 no.4 / pp.589-597 / 2021
  • In simple and multiple regression, regression coefficients have different meanings: not only can the estimates differ, they can even have opposite signs. Understanding the relative contribution of explanatory variables in a regression model is an important part of regression analysis. In a standardized regression model, the regression coefficient can be interpreted as the change in the response variable, in standard deviation units, when the explanatory variable increases by one standard deviation while the values of the other explanatory variables are held fixed. However, the size of the standardized regression coefficient is not a proper measure of the relative importance of each explanatory variable. In this paper, the estimator of the regression coefficient in multiple regression is expressed as a function of the correlation coefficient and the coefficient of determination. Furthermore, it is considered in terms of the effect of an additional explanatory variable and the additional increase in the coefficient of determination. We also explore the relationship between estimates of regression coefficients and correlation coefficients in various plots. These results are applied specifically to the case of two explanatory variables.
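For the two-variable case mentioned above, a well-known textbook identity writes each standardized coefficient in terms of the pairwise correlations. The sketch below is a minimal illustration of that identity, not necessarily the expression derived in the paper; the function name and the example correlations are hypothetical.

```python
def standardized_coefficients(r_y1, r_y2, r_12):
    """Standardized OLS coefficients for y on x1 and x2, from correlations.

    r_y1, r_y2: correlations of the response with x1 and with x2.
    r_12: correlation between x1 and x2 (requires |r_12| < 1).
    """
    denom = 1.0 - r_12 ** 2
    b1 = (r_y1 - r_12 * r_y2) / denom
    b2 = (r_y2 - r_12 * r_y1) / denom
    # Coefficient of determination of the two-variable model.
    r_squared = b1 * r_y1 + b2 * r_y2
    return b1, b2, r_squared


# Hypothetical correlations: x1 is positively correlated with y,
# yet its coefficient in the multiple regression comes out negative.
b1, b2, r2 = standardized_coefficients(r_y1=0.5, r_y2=0.8, r_12=0.9)
print(b1, b2, r2)
```

When the explanatory variables are uncorrelated (`r_12 = 0`), the coefficients reduce to the simple correlations, so a sign change between simple and multiple regression requires correlated explanatory variables.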

Two Diagnostic Plots in Constrained Regression

  • Kim, Myung-Geun
    • Communications for Statistical Applications and Methods / v.16 no.3 / pp.495-500 / 2009
  • Two diagnostic plots, the added variable plot and the partial residual plot, are proposed for the case where a new explanatory variable is linearly added to a constrained regression. They are useful for investigating the effect of adding an explanatory variable to the constrained regression. They give an overall visual impression of the strength of the linear relationship between the response variable and the added variable. A numerical example is provided for illustration.
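As a rough illustration of the added variable plot, the sketch below computes its coordinates for ordinary (unconstrained) least squares with one existing explanatory variable. The constrained version proposed in the paper is not reproduced here; all function names and data are hypothetical.

```python
def simple_ols_residuals(x, y):
    """Residuals from regressing y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def added_variable_points(x_existing, x_new, y):
    """Coordinates of the added variable plot and the slope through them."""
    # Horizontal axis: the part of x_new not explained by x_existing.
    e_new = simple_ols_residuals(x_existing, x_new)
    # Vertical axis: the part of y not explained by x_existing.
    e_y = simple_ols_residuals(x_existing, y)
    # Least-squares slope through the origin of e_y on e_new.
    slope = sum(a * b for a, b in zip(e_new, e_y)) / sum(a * a for a in e_new)
    return e_new, e_y, slope


# Hypothetical noise-free data generated as y = 1 + 2*x1 + 3*x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]
e_new, e_y, slope = added_variable_points(x1, x2, y)
print(slope)
```

The slope through the plotted points equals the coefficient that the new variable would receive in the enlarged model, which is what makes the plot a useful summary of the added variable's effect.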

Biplots of Multivariate Data Guided by Linear and/or Logistic Regression

  • Huh, Myung-Hoe;Lee, Yonggoo
    • Communications for Statistical Applications and Methods / v.20 no.2 / pp.129-136 / 2013
  • Linear regression is the most basic statistical model for exploring the relationship between a numerical response variable and several explanatory variables. Logistic regression plays the role of linear regression for a dichotomous response variable. In this paper, we propose a biplot-type display of multivariate data guided by linear regression and/or logistic regression. The figures show the directional flow of the response variable as well as the interrelationship of the explanatory variables.

A Study on Sex Role Identity and Makeup Behavior (여대생(女大生)의 성역할(性役割) 정체감(正體感)과 화장(化粧) 행동(行動)에 관(關)한 연구(硏究))

  • Kuh, Ja-Myung;Lee, Kwuy-Young
    • Journal of Fashion Business / v.6 no.2 / pp.124-136 / 2002
  • The objectives of this study were to classify the contents of makeup behavior, to investigate the relationship between makeup behavior and sex role identity, and to examine how makeup behavior and makeup satisfaction were influenced by sex role identity and demographics. To this end, the researchers surveyed 162 women aged 18 to 25. The results of this study are as follows. 1) The four factors of makeup behavior were sexual attractiveness, aesthetics, psychological dependence, and makeup interest. 2) There was a significant positive relationship between makeup behavior and sex role identity. 3) Sexual attractiveness was influenced by femininity and income; the explanatory power of the 2 variables was 8.5%. Aesthetics was influenced by masculinity; the explanatory power of the 1 variable was 9.2%. Psychological dependence was influenced by femininity; the explanatory power of the 1 variable was 8.2%. Makeup interest was influenced by masculinity and age; the explanatory power of the 2 variables was 9.0%. 4) Makeup satisfaction was influenced by sexual attractiveness and aesthetics; the explanatory power of the 2 variables was 22.1%.

A Study on the Design of Tolerance for Process Parameter using Decision Tree and Loss Function (의사결정나무와 손실함수를 이용한 공정파라미터 허용차 설계에 관한 연구)

  • Kim, Yong-Jun;Chung, Young-Bae
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.123-129 / 2016
  • In manufacturing, thousands of quality characteristics are measured each day because processes have been automated through advances in computing and technology, and processes are monitored in databases in real time. In particular, data from the design step of the process contribute to delivering the products that customers require, because useful information extracted from the data can be reflected in the product design. In this study, first, the characteristics in the design-step data and the variables affecting them were analyzed with a decision tree to find the relationship between explanatory and target variables. Second, the tolerance of the continuous variables that primarily influence the target variable was obtained by applying the decision tree algorithm C4.5. Finally, the target variable, loss, was calculated with a Taguchi loss function and analyzed. In this paper, we compare the general method, which uses the values of continuous explanatory variables as they are without transforming them to discrete values, with a new method that divides the values of continuous explanatory variables into 3 categories. As a result, the tolerance obtained from the new method was more effective in decreasing the target variable, loss, than that from the general method. In addition, tolerance levels were calculated for the continuous explanatory variables selected as major variables. In further research, a systematic method using data mining decision trees needs to be developed in order to categorize continuous variables under various loss function scenarios.
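The abstract does not say which form of the Taguchi loss function or which cut points were used. The sketch below therefore assumes the standard nominal-the-best quadratic loss and two hypothetical cut points for the three-category split; every name and number in it is illustrative only.

```python
def taguchi_nominal_loss(y, target, tolerance, cost_at_limit):
    """Quadratic loss L(y) = k * (y - target)^2.

    k is chosen so that the loss equals cost_at_limit when y deviates
    from the target by exactly the tolerance: k = cost_at_limit / tolerance^2.
    """
    k = cost_at_limit / tolerance ** 2
    return k * (y - target) ** 2


def three_categories(value, lower_cut, upper_cut):
    """Map a continuous value to one of three categories.

    The cut points are hypothetical inputs, not values from the study.
    """
    if value < lower_cut:
        return "low"
    if value <= upper_cut:
        return "middle"
    return "high"


loss = taguchi_nominal_loss(y=11.0, target=10.0, tolerance=2.0, cost_at_limit=50.0)
category = three_categories(11.0, lower_cut=9.0, upper_cut=12.0)
print(loss, category)
```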

Graphical Method for Multiple Regression Model (다중회귀모형의 그래픽적 방법)

  • Lee, W.R.;Lee, U.K.;Hong, C.S.
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.195-204 / 2007
  • In order to represent multiple regression data, an alternative graphical method, called the SSR plot, is proposed using geometrical description methods. This plot uses the relation that the sum of squares for regression (SSR) of two explanatory variables equals the sum of the SSR of one variable and the increase in the SSR due to adding the other variable to a model that already contains the first. This half-circle-shaped SSR plot contains vectors corresponding to the explanatory variables. We might conclude that explanatory variables whose vectors lie near the horizontal axis do affect the response variable. Also, for the regression model with two explanatory variables, the magnitude of the angle between the two vectors can be used to identify suppression.
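The SSR relation that the plot relies on can be checked numerically. The sketch below is a minimal check using only the standard library and hypothetical data; it does not construct the SSR plot itself, and the function names are assumptions made for illustration.

```python
def centered(values):
    """Subtract the mean from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def ssr_one(x, y):
    """SSR of the simple regression of y on x."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) ** 2 / dot(xc, xc)


def ssr_two(x1, x2, y):
    """SSR of the regression of y on x1 and x2 (normal equations, centered data)."""
    a, b, c = centered(x1), centered(x2), centered(y)
    s11, s12, s22 = dot(a, a), dot(a, b), dot(b, b)
    s1y, s2y = dot(a, c), dot(b, c)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1 * s1y + b2 * s2y


def ssr_extra(x1, x2, y):
    """Increase in SSR from adding x2 to a model that already contains x1."""
    a, b, c = centered(x1), centered(x2), centered(y)
    # Remove the part of x2 and of y that x1 already explains.
    e2 = [bi - dot(a, b) / dot(a, a) * ai for ai, bi in zip(a, b)]
    ey = [ci - dot(a, c) / dot(a, a) * ai for ai, ci in zip(a, c)]
    return dot(e2, ey) ** 2 / dot(e2, e2)


# Hypothetical data.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]
total = ssr_two(x1, x2, y)
parts = ssr_one(x1, y) + ssr_extra(x1, x2, y)
print(total, parts)
```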

Tolerance Computation for Process Parameter Considering Loss Cost : In Case of the Larger is better Characteristics (손실 비용을 고려한 공정 파라미터 허용차 산출 : 망대 특성치의 경우)

  • Kim, Yong-Jun;Kim, Geun-Sik;Park, Hyung-Geun
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.2 / pp.129-136 / 2017
  • With the recent rapid development of information technology and automation in manufacturing industries, tens of thousands of quality variables are estimated and categorized in databases every day. Existing statistical methods, or variable selection and interpretation by experts, place limits on proper judgment. Accordingly, various data mining methods, including decision tree analysis, have been developed in recent years. CART and C5.0 are representative algorithms for decision tree analysis, but these algorithms have limits in defining the tolerance of continuous explanatory variables. Also, target variables are restricted to information that indicates only the quality of the products, such as the rate of defective products. Therefore, it is essential to develop an algorithm that improves upon CART and C5.0 and allows access to new quality information such as loss cost. In this study, a new algorithm was developed not only to find the major variables that minimize the target variable, loss cost, but also to overcome the limits of CART and C5.0. The new algorithm defines the tolerance of variables systematically by adopting 3 categories for the continuous explanatory variables. A larger-the-better characteristic was assumed in the R programming environment to compare the performance of the new algorithm with that of the existing ones, and 10 simulations were performed with 1,000 data sets for each variable. The performance of the new algorithm was verified through a mean test of loss cost. The verification showed that the tolerance of continuous explanatory variables found by the new algorithm lowered loss cost more than that of the existing ones for the larger-the-better characteristic. In conclusion, the new algorithm could be used to find the tolerance of continuous explanatory variables that minimizes the loss in the process, taking into account the loss cost of the products.
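For a larger-the-better characteristic, the standard Taguchi loss is inversely proportional to the square of the measured value. The sketch below assumes that standard form; the abstract does not give the exact loss cost definition used in the study, and the study itself worked in R, so the Python function names and numbers here are purely illustrative.

```python
def larger_the_better_loss(y, lower_limit, cost_at_limit):
    """Loss L(y) = k / y^2 for a larger-the-better characteristic.

    k is chosen so that the loss equals cost_at_limit when y sits exactly
    at the lower specification limit: k = cost_at_limit * lower_limit^2.
    """
    if y <= 0:
        raise ValueError("y must be positive for a larger-the-better characteristic")
    k = cost_at_limit * lower_limit ** 2
    return k / y ** 2


def mean_loss(values, lower_limit, cost_at_limit):
    """Average loss over a set of observed values."""
    losses = [larger_the_better_loss(v, lower_limit, cost_at_limit) for v in values]
    return sum(losses) / len(losses)


# Hypothetical observations, limit, and cost.
print(mean_loss([2.0, 4.0], lower_limit=2.0, cost_at_limit=50.0))
```

A mean loss of this kind could then be compared between tolerance rules, in the spirit of the mean test of loss cost mentioned in the abstract.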

Methodology for Determining Functional Forms in Developing Statistical Collision Models (교통사고모형 개발에서의 함수식 도출 방법론에 관한 연구)

  • Baek, Jong-Dae;Hummer, Joseph
    • International Journal of Highway Engineering / v.14 no.5 / pp.189-199 / 2012
  • PURPOSES: The purpose of this study is to propose a new methodology for developing statistical collision models and to show the validation results of the methodology. METHODS: A new modeling method that introduces variables into the model one by one in a multiplicative form is suggested. A method for choosing the explanatory variables to be introduced into the model is explained. A method for determining the functional form for each explanatory variable is introduced, as well as a parameter estimation procedure. A model selection method is also dealt with. Finally, validation results are provided to demonstrate the efficacy of the final models developed using the method suggested in this study. RESULTS: According to the validation results for total and injury collisions, the predictive power of the models developed using the method suggested in this study was better than that of generalized linear models for the same data. CONCLUSIONS: Using the methodology suggested in this study, we could develop statistical collision models with better predictive power. This was because the methodology enabled us to find the relationship between the dependent variable and each explanatory variable individually and to find functional forms for these relationships, which are likely to be non-linear.

An educational tool for regression models with dummy variables using Excel VBA (엑셀 VBA을 이용한 가변수 회귀모형 교육도구 개발)

  • Choi, Hyun Seok;Park, Cheolyong
    • Journal of the Korean Data and Information Science Society / v.24 no.3 / pp.593-601 / 2013
  • We often need to include categorical variables as explanatory variables in regression models. The categorical variables in regression models can be quantified through dummy variables. In this study, we provide an educational tool using Excel VBA for displaying regression lines along with test results for regression models with a continuous explanatory variable and one or two categorical explanatory variables. The regression lines with test results are provided step by step for the model(s) with interaction(s), the model(s) without interaction(s) but with dummy variables, and the model without dummy variable(s). With this tool, we can easily understand the meaning of dummy variables and interaction effects through graphics and further decide which model is better suited to the data at hand.
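To make the role of a dummy variable and its interaction concrete, the sketch below uses the standard fact that the full interaction model y = b0 + b1*x + b2*d + b3*x*d, with a 0/1 dummy d, gives the same fitted lines as fitting each group separately. It is written in Python rather than Excel VBA and is not the authors' tool; the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def dummy_interaction_coefficients(x, y, d):
    """Coefficients (b0, b1, b2, b3) of y = b0 + b1*x + b2*d + b3*x*d."""
    x0 = [xi for xi, di in zip(x, d) if di == 0]
    y0 = [yi for yi, di in zip(y, d) if di == 0]
    x1 = [xi for xi, di in zip(x, d) if di == 1]
    y1 = [yi for yi, di in zip(y, d) if di == 1]
    a0, s0 = fit_line(x0, y0)  # line for the reference group (d = 0)
    a1, s1 = fit_line(x1, y1)  # line for the other group (d = 1)
    # b2 shifts the intercept and b3 shifts the slope for the d = 1 group.
    return a0, s0, a1 - a0, s1 - s0


# Hypothetical data: group 0 follows 1 + 2x, group 1 follows 4 + 0.5x.
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
d = [0, 0, 0, 1, 1, 1]
y = [3.0, 5.0, 7.0, 4.5, 5.0, 5.5]
print(dummy_interaction_coefficients(x, y, d))
```

Dropping the interaction term forces both groups to share one slope, which is the difference between the first two model types listed in the abstract.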

Linear profile monitoring with random covariate (설명변수가 랜덤인 성형 프로파일 연구)

  • Kim, Daeun;Lee, Sungim;Lim, Johan
    • The Korean Journal of Applied Statistics / v.35 no.3 / pp.335-346 / 2022
  • A profile control chart aims to detect a change in the functional relationship among multivariate characteristics in statistical process control. When monitoring two variables, the linear profile of interest consists of the intercept and slope of one variable (the response variable) regressed on the other (the explanatory variable). Previous studies on monitoring linear profiles mostly assume that the explanatory variables are the same for all profiles. However, there are also cases where they vary from profile to profile. This paper extends the monitoring method to the case where the explanatory variables differ for each profile. We examine the new method's performance through simulation and apply it to monitoring network intrusion using the NSL-KDD data.