An Empirical Characteristic Function Approach to Selecting a Transformation to Normality
Yeo, In-Kwon; Johnson, Richard A.; Deng, XinWei
In this paper, we study the problem of transforming data to normality. We propose estimating the transformation parameter by minimizing a weighted squared distance between the empirical characteristic function of the transformed data and the characteristic function of the normal distribution. The approach also accommodates other symmetric target characteristic functions. Asymptotic properties are established for a random sample from an unknown distribution; the proofs show that the weight function must be modified to have thinner tails. We also propose a method for computing the influence function of M-equations that take the form of U-statistics. The influence function calculations and a small Monte Carlo simulation show that our estimates are less sensitive to a few outliers than the maximum likelihood estimates.
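The estimation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the Yeo-Johnson power family as the transformation, standardization by the sample mean and standard deviation, a standard normal density as the weight function (the abstract notes that the paper's weight is modified to have thinner tails, which this sketch does not attempt), and the hypothetical helper names `ecf_distance` and `fit_lambda`. Under these assumptions the weighted integral of the squared difference between the empirical characteristic function and exp(-t^2/2) has a closed form, so no numerical integration is needed.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import yeojohnson


def ecf_distance(lmbda, x):
    """Weighted squared distance between the ECF of the standardized,
    transformed sample and the N(0, 1) characteristic function.

    Assumption: the weight is the N(0, 1) density, which gives the
    closed form below (an Epps-Pulley-type statistic divided by n).
    """
    y = yeojohnson(x, lmbda=lmbda)      # transform with a fixed parameter
    z = (y - y.mean()) / y.std()        # standardize (illustrative choice)
    n = z.size
    diff = z[:, None] - z[None, :]      # all pairwise differences z_j - z_k
    # Integral of |ECF|^2 against the weight: a double sum over pairs.
    term_pairs = np.exp(-0.5 * diff**2).sum() / n**2
    # Cross term between the ECF and the normal characteristic function.
    term_cross = np.sqrt(2.0) * np.exp(-0.25 * z**2).mean()
    # Integral of the squared normal characteristic function against the weight.
    term_const = 1.0 / np.sqrt(3.0)
    return term_pairs - term_cross + term_const


def fit_lambda(x, bounds=(-2.0, 4.0)):
    """Minimize the distance over the transformation parameter.

    The search interval is an arbitrary illustrative choice.
    """
    res = minimize_scalar(ecf_distance, bounds=bounds, args=(x,), method="bounded")
    return res.x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.lognormal(size=200)    # right-skewed toy data
    lam = fit_lambda(sample)
    print("estimated lambda:", lam)
    print("distance at estimate:", ecf_distance(lam, sample))
```

The pairwise-difference matrix makes the cost quadratic in the sample size, which is acceptable for a sketch; the double sum also reflects why the estimating equation has the U-statistic form mentioned in the abstract.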