Search | Korea Science

Multiple Deletions in Logistic Regression Models

Jung, Kang-Mo
Communications for Statistical Applications and Methods / v.16 no.2 / pp.309-315 / 2009
We extended the results of Roy and Guria (2008) to multiple deletions in logistic regression models. Because swamping and masking effects can prevent single deletions from correctly detecting outliers or influential observations, multiple deletions are needed. We developed conditional deletion diagnostics designed to overcome masking effects, and we derived closed forms for several statistics in logistic regression models. These closed forms provide useful diagnostics.
https://doi.org/10.5351/CKSS.2009.16.2.309

Identifying Multiple Leverage Points and Outliers in Multivariate Linear Models

Yoo, Jong-Young
Communications for Statistical Applications and Methods / v.7 no.3 / pp.667-676 / 2000
This paper focuses on the problem of detecting multiple leverage points and outliers in multivariate linear models. It is well known that the identification of these points is affected by masking and swamping effects. To identify them, Rousseeuw (1985) used robust MVE (minimum volume ellipsoid) estimators, which have a breakdown point of approximately 50%. Rousseeuw and van Zomeren (1990) suggested a robust distance based on the MVE; however, its computation is extremely difficult when the number of observations n is large. In this study, we propose a new algorithm to reduce the computational difficulty of the MVE. The proposed method is powerful in identifying multiple leverage points and outliers and is also effective in reducing the computational difficulty of the MVE.

Identification of Regression Outliers Based on Clustering of LMS-residual Plots

Kim, Bu-Yong; Oh, Mi-Hyun
Communications for Statistical Applications and Methods / v.11 no.3 / pp.485-494 / 2004
An algorithm is proposed to identify multiple outliers in linear regression. It is based on the clustering of residuals from the least median of squares (LMS) estimation. A cut-height criterion for the hierarchical cluster tree is suggested, which yields the optimal clustering of the regression outliers. The effectiveness of the procedures is compared on classic and artificial data sets, and it is shown that the proposed algorithm is superior to the one based on least squares estimation. In particular, the proposed algorithm deals very well with the masking and swamping effects, while the least squares version does not.
https://doi.org/10.5351/CKSS.2004.11.3.485
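
The abstract above contrasts residuals from a least median of squares (LMS) fit with residuals from least squares. A minimal sketch of that contrast for a single predictor follows. It approximates the LMS line by trying every line through two data points, which is an assumption of this sketch and not necessarily the authors' estimator. The function names `lms_line` and `ols_line` and the toy data are hypothetical, and the paper's hierarchical clustering and cut-height criterion are not reproduced.

```python
from itertools import combinations
from statistics import median


def lms_line(xs, ys):
    """Approximate the LMS line by searching all lines through two data points."""
    best = None  # (median squared residual, intercept, slope)
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        crit = median([(y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)])
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    return best[1], best[2]


def ols_line(xs, ys):
    """Ordinary least squares line, for comparison."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Toy data (hypothetical): seven points on y = 2x + 1, then three outliers.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[7:] = [0, 0, 0]

lms_a, lms_b = lms_line(xs, ys)
ols_a, ols_b = ols_line(xs, ys)

# Residuals from the LMS fit separate the three outliers from the rest.
lms_resid = [y - lms_a - lms_b * x for x, y in zip(xs, ys)]
suspects = [i for i, r in enumerate(lms_resid) if abs(r) > 1e-9]
```

On this toy data the LMS line follows the seven clean points, so only the last three observations have nonzero residuals, while the least squares slope is pulled negative by the same three points.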

Outlier tests on potential outliers (잠재적 이상치군에 대한 검정)

Seo, Han Son
The Korean Journal of Applied Statistics / v.30 no.1 / pp.159-167 / 2017
Observations identified as potential outliers are usually tested for real outliers; however, some outlier detection methods skip a formal test or perform a test using simulated p-values. We introduce test procedures for outliers by testing subsets of potential outliers rather than by testing individual observations of potential outliers to avoid masking or swamping effects. Examples to illustrate methods and a Monte Carlo study to compare the power of the various methods are presented.
https://doi.org/10.5351/KJAS.2017.30.1.159

Unmasking Multiple Outliers in Multivariate Data

Yoo, Jong-Young
Communications for Statistical Applications and Methods / v.13 no.1 / pp.29-38 / 2006
We proposed a procedure for detecting multiple outliers in multivariate data. Rousseeuw and van Zomeren (1990) suggested the robust distance $RD_i$, computed with the resampling algorithm. However, $RD_i$ relies on the assumption that X is in general position (X is said to be in general position when every subsample of size p+1 has rank p). From a practical point of view, this is clearly unrealistic. In this paper, we proposed a computing method for approximating the MVE that is not subject to these problems. The procedure is easy to compute and works well even if a subsample is a singular or nearly singular matrix.
https://doi.org/10.5351/CKSS.2006.13.1.029
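
For reference, the robust distance $RD_i$ named in the abstract above is usually written as follows. The symbols $T(X)$ and $C(X)$, standing for the MVE location and scatter estimates, follow the usual notation of Rousseeuw and van Zomeren (1990) and do not appear in the abstract itself.

```latex
$$
RD_i = \sqrt{\bigl(x_i - T(X)\bigr)\, C(X)^{-1}\, \bigl(x_i - T(X)\bigr)^{\top}}
$$
```

The form is that of a Mahalanobis distance with the sample mean and covariance replaced by robust estimates. Inverting $C(X)$ is the step that fails when a subsample used to build it is singular, which is the situation the abstract addresses.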

Influential Points in GLMs via Backwards Stepping

Jeong, Kwang-Mo; Oh, Hae-Young
Communications for Statistical Applications and Methods / v.9 no.1 / pp.197-212 / 2002
When assessing the goodness-of-fit of a model, a small subset of deviating observations can give rise to a significant lack of fit. It is therefore important to identify such observations and to assess their effects on various aspects of the analysis. Cook's distance is usually used to detect influential observations, but it is sometimes not fully effective in identifying a truly influential set of observations because masking or swamping effects may exist. In this paper we confine our attention to influential subsets in GLMs such as logistic regression models and loglinear models. We modify a backwards stepping algorithm, which was originally suggested for detecting outlying cells in contingency tables, to detect influential observations in GLMs. The algorithm consists of two steps, the identification step and the testing step. In the identification step we identify influential observations based on influence measures such as Cook's distances. In the testing step we test whether the subset of identified observations is significant. Finally, we explain the proposed method through two types of data sets related to the logistic regression model and the loglinear model, respectively.
https://doi.org/10.5351/CKSS.2002.9.1.197
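
The identification step in the abstract above relies on influence measures such as Cook's distance. A minimal sketch of the single-deletion Cook's distance is given below for ordinary least squares with one predictor, which is a simplification assumed here for brevity; the paper itself works with GLMs, and its backwards stepping and testing steps are not reproduced. The function name `cooks_distances` and the data are hypothetical.

```python
def cooks_distances(xs, ys):
    """Cook's distance for each observation in the fit y = a + b*x."""
    n, p = len(xs), 2  # p counts the intercept and the slope
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    intercept = ybar - slope * xbar
    resid = [y - intercept - slope * x for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in resid) / (n - p)             # residual variance
    lev = [1 / n + (x - xbar) ** 2 / sxx for x in xs]    # leverages h_i
    return [e * e * h / (p * s2 * (1 - h) ** 2) for e, h in zip(resid, lev)]


# Toy data (hypothetical): the last point has high leverage and lies off the trend.
xs = [1, 2, 3, 4, 5, 10]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]

d = cooks_distances(xs, ys)
most_influential = max(range(len(d)), key=d.__getitem__)
```

Because each distance is computed by deleting one observation at a time, two or more influential points that support each other can all receive small values. That is the masking effect that motivates subset-based procedures such as the one described in the abstract.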

V-mask Type Criterion for Identification of Outliers In Logistic Regression

Kim, Bu-Yong
Communications for Statistical Applications and Methods / v.12 no.3 / pp.625-634 / 2005
A procedure is proposed to identify multiple outliers in logistic regression. It detects the leverage points by means of hierarchical clustering of the robust distances based on the minimum covariance determinant estimator, and then it employs a V-mask type criterion on the scatter plot of robust residuals against robust distances to classify the observations into vertical outliers, bad leverage points, good leverage points, and regular points. The effectiveness of the proposed procedure is evaluated on classic and artificial data sets, and it is shown that the procedure deals very well with the masking and swamping effects.
https://doi.org/10.5351/CKSS.2005.12.3.625

The Sequential Testing of Multiple Outliers in Linear Regression

Park, Jinpyo; Park, Heechang
Communications for Statistical Applications and Methods / v.8 no.2 / pp.337-346 / 2001
In this paper we consider the problem of identifying and testing outliers in linear regression. First we consider the problem of testing the null hypothesis of no outliers. A test based on the ratio of two scale estimates is proposed. We show the asymptotic distribution of the test statistic by Monte Carlo simulation and investigate its properties. Next we consider the problem of identifying the outliers. A forward sequential procedure based on the suggested test is proposed and shown to perform fairly well. The forward sequential procedure is unaffected by masking and swamping effects because the test statistic is based on a robust estimate.

The Scale Ratio Testing of Multiple Outliers in Linear Regression

Park, Jin-Pyo
Journal of the Korean Data and Information Science Society / v.14 no.3 / pp.673-685 / 2003
In this paper we consider the problem of identifying and testing outliers in linear regression. First we consider the problem of testing the null hypothesis of no outliers. A test based on the ratio of two residual scale estimates is proposed. We show the asymptotic distribution of the test statistic by Monte Carlo simulation and investigate its properties. Next we consider the problem of identifying the outliers. A forward sequential procedure using the suggested test is proposed and shown to perform fairly well. Unlike other forward procedures, the present one is unaffected by masking and swamping effects because the test statistic is based on a robust scale estimate.
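
To make the idea of a ratio of two residual scale estimates concrete, here is a minimal sketch. The abstract does not say which two scale estimates are used, so the pairing below (sample standard deviation over the normalized median absolute deviation) is purely an assumption for illustration. The function name `scale_ratio` and the residual values are hypothetical, and the critical values of the actual test are not reproduced.

```python
from statistics import median, stdev


def scale_ratio(residuals):
    """Ratio of a classical scale estimate to a robust one (assumed pairing)."""
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    robust_scale = 1.4826 * mad      # MAD rescaled for consistency at the normal
    return stdev(residuals) / robust_scale


# Hypothetical residuals: a clean set, and the same set with one gross outlier.
clean = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
contaminated = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 15.0]

ratio_clean = scale_ratio(clean)
ratio_contaminated = scale_ratio(contaminated)
```

The outlier inflates the standard deviation but leaves the median-based scale unchanged, so the ratio grows when outliers are present. A large ratio is therefore evidence against the null hypothesis of no outliers under this assumed pairing.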

The Forward Sequential Procedure for the Identifying Multiple Outliers in Linear Regression

Park, Jin-Pyo
Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.1053-1066 / 2005
In this paper we consider the problem of identifying and testing outliers in linear regression. First we consider the use of the so-called scale ratio test for testing the null hypothesis of no outliers. This test is based on the ratio of two residual scale estimates. We show the asymptotic distribution of the test statistic and investigate its properties. Next we consider the problem of identifying the outliers. A forward sequential procedure using the suggested test is proposed. The new method is compared with the classical procedure on a real data example. Unlike other forward procedures, the present one is unaffected by masking and swamping effects because the test statistic is based on a robust scale estimate.