Robustness, Data Analysis, and Statistical Modeling: The First 50 Years and Beyond

Barrios, Erniel B.

doi:10.5351/CSAM.2015.22.6.543

Communications for Statistical Applications and Methods

Volume 22, Issue 6, Pages 543-556, 2015. pISSN 2287-7843, eISSN 2383-4757

The Korean Statistical Society (한국통계학회)

Barrios, Erniel B. (School of Statistics, University of the Philippines Diliman)

Received : 2015.10.29
Accepted : 2015.11.20
Published : 2015.11.30

https://doi.org/10.5351/CSAM.2015.22.6.543

Abstract

We present a survey of contributions that defined the nature and extent of robust statistics over the last 50 years. Starting from the pioneering work of Tukey, Huber, and Hampel, which focused on robust estimation of a location parameter, we present various generalizations of these estimation procedures that cover a wide variety of models and data analysis methods. Among these extensions, we present linear models, clustered and dependent observations, time series data, binary and discrete data, models for spatial data, nonparametric methods, and forward search methods for outliers. We also present current interests in robust statistics and conclude with suggestions on possible future directions of this area for statistical science.

References

Alhamzawi, R. (2015). Model selection in quantile regression models, Journal of Applied Statistics, 42, 445-458. https://doi.org/10.1080/02664763.2014.959905
Atkinson, A. C. (1994). Fast very robust methods for the detection of multiple outliers, Journal of the American Statistical Association, 89, 1329-1339. https://doi.org/10.1080/01621459.1994.10476872
Atkinson, A. C. (2009). Econometric applications of the forward search in regression: Robustness, diagnostics, and graphics, Econometric Reviews, 28, 21-39.
Atkinson, A. C. and Cheng, T. C. (2000). On robust linear regression with incomplete data, Computational Statistics & Data Analysis, 33, 361-380. https://doi.org/10.1016/S0167-9473(99)00061-4
Atkinson, A. C. and Riani, M. (2007a). Exploratory tools for clustering multivariate data, Computational Statistics & Data Analysis, 52, 272-285. https://doi.org/10.1016/j.csda.2006.12.034
Atkinson, A. C. and Riani, M. (2007b). Building regression models with the forward search, Journal of Computing and Information Technology, 15, 287-294. https://doi.org/10.2498/cit.1001135
Bastero, R. F. and Barrios, E. B. (2011). Robust estimation of a spatiotemporal model with structural change, Communications in Statistics-Simulation and Computation, 40, 448-468. https://doi.org/10.1080/03610918.2010.543298
Beran, R. (1982). Robust estimation in models for independent non-identically distributed data, The Annals of Statistics, 10, 415-428. https://doi.org/10.1214/aos/1176345783
Bertaccini, B. and Varriale, R. (2007). Robust analysis of variance: An approach based on the forward search, Computational Statistics & Data Analysis, 51, 5172-5183. https://doi.org/10.1016/j.csda.2006.08.010
Campano, W. Q. and Barrios, E. B. (2011). Robust estimation of a time series model with structural change, Journal of Statistical Computation and Simulation, 81, 909-927. https://doi.org/10.1080/00949650903575211
Cantoni, E. and Ronchetti, E. (2001). Robust inference for generalized linear models, Journal of the American Statistical Association, 96, 1022-1030. https://doi.org/10.1198/016214501753209004
Cao, F., Ye, H. and Wang, D. (2015). A probabilistic learning algorithm for robust modeling using neural networks with random weights, Information Sciences, 313, 62-78. https://doi.org/10.1016/j.ins.2015.03.039
Carroll, R. J. and Ruppert, D. (1982). Robust estimation in heteroscedastic linear models, The Annals of Statistics, 10, 429-441. https://doi.org/10.1214/aos/1176345784
Chang, L., Hu, B., Chang, G. and Li, A. (2013). Robust derivative-free Kalman filter based on Huber's M-estimation, Journal of Process Control, 23, 1555-1561. https://doi.org/10.1016/j.jprocont.2013.05.004
Cizek, P. (2008). Robust and efficient adaptive estimation of binary-choice regression models, Journal of the American Statistical Association, 103, 687-696. https://doi.org/10.1198/016214508000000175
Cizek, P. (2012). Semiparametric robust estimation of truncated and censored regression models, Journal of Econometrics, 168, 347-366. https://doi.org/10.1016/j.jeconom.2012.02.002
Cressie, N. and Hawkins, D. M. (1980). Robust estimation of the variogram: I, Mathematical Geology, 12, 115-125. https://doi.org/10.1007/BF01035243
Dang, V. A., Kim, M. and Shin, Y. (2015). In search of robust methods for dynamic panel data models in empirical corporate finance, Journal of Banking & Finance, 53, 84-98. https://doi.org/10.1016/j.jbankfin.2014.12.009
de Luna, X. and Genton, M. G. (2001). Robust simulation-based estimation of ARMA models, Journal of Computational and Graphical Statistics, 10, 370-387. https://doi.org/10.1198/10618600152628347
Dogan, O. and Taspinar, S. (2014). Spatial autoregressive models with unknown heteroscedasticity: A comparison of Bayesian and robust GMM approach, Regional Science and Urban Economics, 45, 1-21. https://doi.org/10.1016/j.regsciurbeco.2013.12.003
Field, C. A., Pang, Z. and Welsh, A. H. (2010). Bootstrapping robust estimates for clustered data, Journal of the American Statistical Association, 105, 1606-1616. https://doi.org/10.1198/jasa.2010.tm09541
Furno, M. (2004). ARCH tests and quantile regressions, Journal of Statistical Computation and Simulation, 74, 277-292. https://doi.org/10.1080/0094965031000151178
Gaglianone, W. P., Lima, L. R., Linton, O. and Smith, D. R. (2011). Evaluating value-at-risk models via quantile regression, Journal of Business & Economic Statistics, 29, 150-160. https://doi.org/10.1198/jbes.2010.07318
Hampel, F. R. (1971). A general qualitative definition of robustness, The Annals of Mathematical Statistics, 42, 1887-1896. https://doi.org/10.1214/aoms/1177693054
Hampel, F. R. (1973). Robust estimation: A condensed partial survey, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 27, 87-104. https://doi.org/10.1007/BF00536619
Hampel, F. R. (1974). The influence curve and its role in robust estimation, Journal of the American Statistical Association, 69, 383-393. https://doi.org/10.1080/01621459.1974.10482962
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (1986). Robust Statistics: The Approach Based on Influence Functions, John Wiley & Sons, New York.
Hardle, W. (1984). Robust regression function estimation, Journal of Multivariate Analysis, 14, 169-180. https://doi.org/10.1016/0047-259X(84)90003-4
Hastie, T. J. and Tibshirani, R. J. (1990). Generalized Additive Models, Chapman and Hall, London.
He, X. and Zhu, L. X. (2003). A lack-of-fit test for quantile regression, Journal of the American Statistical Association, 98, 1013-1022. https://doi.org/10.1198/016214503000000963
He, X., Fung, W. Z. and Zhu, Z. (2005). Robust estimation in generalized partial linear models for clustered data, Journal of the American Statistical Association, 100, 1176-1184. https://doi.org/10.1198/016214505000000277
Hettmansperger, T. P. and McKean, J. W. (1988). Robust Nonparametric Statistical Methods, Arnold, London.
Hettmansperger, T. P., McKean, J. W. and Sheather, S. J. (2000). Robust nonparametric methods, Journal of the American Statistical Association, 95, 1308-1312. https://doi.org/10.1080/01621459.2000.10474337
Hoshino, T. (2014). Quantile regression estimation of partially linear additive models, Journal of Nonparametric Statistics, 26, 509-536. https://doi.org/10.1080/10485252.2014.929675
Huang, A. Y. H. (2012). Volatility forecasting by quantile regression, Applied Economics, 44, 423-433. https://doi.org/10.1080/00036846.2010.508727
Huber, P. J. (1964). Robust estimation of a location parameter, The Annals of Mathematical Statistics, 35, 73-101. https://doi.org/10.1214/aoms/1177703732
Huber, P. J. (1972). The 1972 Wald lecture. Robust statistics: A review, The Annals of Mathematical Statistics, 43, 1041-1067. https://doi.org/10.1214/aoms/1177692459
Huber, P. J. (1973). Robust regression: Asymptotics, conjectures and Monte Carlo, The Annals of Statistics, 1, 799-821. https://doi.org/10.1214/aos/1176342503
Huber, P. J. (2002). John W. Tukey's contributions to robust statistics, The Annals of Statistics, 30, 1640-1648. https://doi.org/10.1214/aos/1043351251
Huber, P. J. and Ronchetti, E. M. (2009). Robust Statistics, 2nd ed., John Wiley and Sons, New York.
Hubert, M. and Rousseeuw, P. J. (1997). Robust regression with both continuous and binary regressors, Journal of Statistical Planning and Inference, 57, 153-163. https://doi.org/10.1016/S0378-3758(96)00041-9
Hung, K. W. and Siu, W. C. (2015). Learning-based image interpolation via robust k-NN searching for coherent AR parameters estimation, Journal of Visual Communication and Image Representation, 31, 305-311. https://doi.org/10.1016/j.jvcir.2015.07.006
Karunamuni, R. J., Tang, Q. and Zhao, B. (2015). Robust and efficient estimation of effective dose, Computational Statistics & Data Analysis, 90, 47-60. https://doi.org/10.1016/j.csda.2015.04.001
Kelly, G. E. and Lindsey, J. K. (2002). Robust estimation of the median lethal dose, Journal of Biopharmaceutical Statistics, 12, 137-147. https://doi.org/10.1081/BIP-120014416
Kim, M. O. and Yang, Y. (2011). Semiparametric approach to a random effects quantile regression, Journal of the American Statistical Association, 106, 1405-1417. https://doi.org/10.1198/jasa.2011.tm10470
Kitromilidou, S. and Fokianos, K. (2015). Robust estimation methods for a class of log-linear count time series models, Journal of Statistical Computation and Simulation, DOI: 10.1080/00949655.2015.1035271.
Li, Y. and Zhu, J. (2008). L1-norm quantile regression, Journal of Computational and Graphical Statistics, 17, 163-185. https://doi.org/10.1198/106186008X289155
Lv, Z., Zhu, H. and Yu, K. (2014). Robust variable selection for nonlinear models with diverging number of parameters, Statistics & Probability Letters, 91, 90-97. https://doi.org/10.1016/j.spl.2014.04.013
Mann, H. B. and Wald, A. (1942). On the choice of the number of class intervals in the application of the chi square test, The Annals of Mathematical Statistics, 13, 306-317. https://doi.org/10.1214/aoms/1177731569
Maronna, R. A. and Zamar, R. H. (2002). Robust estimates of location and dispersion for high dimensional datasets, Technometrics, 44, 307-317. https://doi.org/10.1198/004017002188618509
Mavridis, D. and Moustaki, I. (2009). The forward search algorithm for detecting response patterns in factor analysis for binary data, Journal of Computational and Graphical Statistics, 18, 1016-1034. https://doi.org/10.1198/jcgs.2009.08060
Moscone, F. and Tosetti, E. (2015). Robust estimation under error cross section dependence, Economics Letters, 133, 100-104. https://doi.org/10.1016/j.econlet.2015.05.020
Nassiri, V. and Loris, I. (2013). A generalized quantile regression model, Journal of Applied Statistics, 40, 1090-1105. https://doi.org/10.1080/02664763.2013.780158
Perez, B., Molina, I. and Pena, D. (2014). Outlier detection and robust estimation in linear regression models with fixed group effects, Journal of Statistical Computation and Simulation, 84, 2652-2669. https://doi.org/10.1080/00949655.2013.811669
Riani, M. (2004). Extensions of the forward search to time series, Studies in Nonlinear Dynamics & Econometrics, 8, Article 2.
Rieder, H. (1996). Robust Statistics, Data Analysis, and Computer Intensive Methods, Springer-Verlag, New York.
Sacks, J. and Ylvisaker, D. (1972). A note on Huber's robust estimation of a location parameter, The Annals of Mathematical Statistics, 43, 1068-1075. https://doi.org/10.1214/aoms/1177692460
Santos, K. C. P. and Barrios, E. B. (2015). Improving predictive accuracy of logistic regression model using ranked set samples, Communications in Statistics-Simulation and Computation, DOI: 10.1080/03610918.2014.955113.
Shahriari, H. and Ahmadi, O. (2015). Robust estimation of the mean vector for high-dimensional data set using robust clustering, Journal of Applied Statistics, 42, 1183-1205. https://doi.org/10.1080/02664763.2014.999030
Tukey, J. W. (1962). The future of data analysis, The Annals of Mathematical Statistics, 33, 1-67. https://doi.org/10.1214/aoms/1177704711
Ursu, E. and Pereau, J. C. (2014). Robust modelling of periodic vector autoregressive time series, Journal of Statistical Planning and Inference, 155, 93-106. https://doi.org/10.1016/j.jspi.2014.07.005
Vretos, N., Tefas, A. and Pitas, I. (2013). Using robust dispersion estimation in support vector machines, Pattern Recognition, 46, 3441-3451. https://doi.org/10.1016/j.patcog.2013.05.016
Wang, Y., Fan, Y., Bhatt, P. and Davatzikos, C. (2010). High-dimensional pattern regression using machine learning: From medical images to continuous clinical variables, Neuroimage, 50, 1519-1535. https://doi.org/10.1016/j.neuroimage.2009.12.092
Wei, Y. and Carroll, R. J. (2009). Quantile regression with measurement error, Journal of the American Statistical Association, 104, 1129-1143. https://doi.org/10.1198/jasa.2009.tm08420
Wong, R. K. W., Yao, F. and Lee, T. C. M. (2014). Robust estimation for generalized additive models, Journal of Computational and Graphical Statistics, 23, 270-289. https://doi.org/10.1080/10618600.2012.756816
Xiao, Z. (2012). Robust inference in nonstationary time series models, Journal of Econometrics, 169, 211-223. https://doi.org/10.1016/j.jeconom.2012.01.027
Zhao, J. and Wang, J. (2009). Robust testing procedures in heteroscedastic linear models, Communications in Statistics-Simulation and Computation, 38, 244-256. https://doi.org/10.1080/03610910802468666
