- Volume 19 Issue 4
We frequently use Tukey's boxplot to identify outliers in a batch of observations of a continuous variable. In doing so, we implicitly assume that the underlying distribution belongs to the normal family. This practice is often superficial and improper, since in reality many variables exhibit skewness. In this short paper, we build a modified boxplot and set up an outlier identification procedure under the assumption that the observations are generated from the skew normal distribution (Azzalini, 1985), which is an extension of the normal distribution. The statistical performance of the proposed procedure is examined with simulated datasets.
Boxplot; outlier; skew normal distribution
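
To illustrate the motivation stated in the abstract, the following is a minimal sketch, not the procedure proposed in the paper. It assumes the conventional Tukey fences (Q1 - 1.5 IQR, Q3 + 1.5 IQR) and draws skew normal data through the standard representation Z = δ|U0| + sqrt(1 - δ²) U1 with δ = α / sqrt(1 + α²). The function names, the shape value α = 5, the sample size, and the seed are all illustrative assumptions.

```python
import random
import statistics


def tukey_fences(data, k=1.5):
    """Return the (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def skew_normal_sample(n, alpha, rng):
    """Draw n values from a standard skew normal with shape alpha.

    Uses Z = delta*|U0| + sqrt(1 - delta^2)*U1, with U0, U1 independent
    standard normal variables and delta = alpha / sqrt(1 + alpha^2).
    """
    delta = alpha / (1 + alpha ** 2) ** 0.5
    scale = (1 - delta ** 2) ** 0.5
    return [delta * abs(rng.gauss(0, 1)) + scale * rng.gauss(0, 1)
            for _ in range(n)]


def flag_counts(data, k=1.5):
    """Count observations below the lower fence and above the upper fence."""
    lower, upper = tukey_fences(data, k)
    n_low = sum(x < lower for x in data)
    n_high = sum(x > upper for x in data)
    return n_low, n_high


# Illustrative settings (assumptions, not taken from the paper).
rng = random.Random(0)
sample = skew_normal_sample(5000, alpha=5.0, rng=rng)
n_low, n_high = flag_counts(sample)
print("flagged below lower fence:", n_low)
print("flagged above upper fence:", n_high)
```

With a positive shape parameter, the symmetric fences flag far more points in the long right tail than in the short left tail, even though every point comes from the same skew normal model. This asymmetry is what a boxplot calibrated to the skew normal distribution aims to correct.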
- Azzalini, A. (1985). A class of distributions which includes the normal ones, Scandinavian Journal of Statistics, 12, 171-178.
- Azzalini, A. (2011). R package sn (Version 0.4). http://www.r-project.org.
- Hubert, M. and Vandervieren, E. (2008). An adjusted boxplot for skewed distributions, Computational Statistics and Data Analysis, 52, 5186-5201. https://doi.org/10.1016/j.csda.2007.11.008
- Kim, S. (2010). New calibration methods with asymmetric data, Korean Journal of Applied Statistics, 23, 759-765. https://doi.org/10.5351/KJAS.2010.23.4.759
- Seo, H. S., Shin, J. K. and Kim, H. M. (2009). Projected circular and l-axial skew-normal distributions, Korean Journal of Applied Statistics, 22, 879-891. https://doi.org/10.5351/KJAS.2009.22.4.879