Impact of Outliers on the Statistical Measures of the Environmental Monitoring Data in Busan Coastal Sea

- Journal title : Ocean and Polar Research
- Volume 38, Issue 2, 2016, pp.149-159
- Publisher : Korea Institute of Ocean Science & Technology
- DOI : 10.4217/OPR.2016.38.2.149

Title & Authors

Impact of Outliers on the Statistical Measures of the Environmental Monitoring Data in Busan Coastal Sea

Cho, Hong-Yeon; Lee, Ki-Seop; Ahn, Soon-Mo;

Abstract

The statistical measures of coastal environmental data are used in a variety of statistical inferences, hypothesis tests, and data-driven models. If the measures are biased, the resulting estimates and models may also be biased, and the potential for bias is large when the data contain outliers, defined here as extraordinarily large or small values. This study suggests robust statistical measures as alternatives to the commonly used measures and assesses their performance through a quantitative comparison with the typical measures of location, spread, and shape for environmental monitoring data from the Busan coastal sea. Outliers were detected using Rosner's test, and about 5-10% of the nutrient data were found to contain outliers. After removal (zero-weighting) of the outliers, the relative changes in the mean and standard deviation between the before- and after-removal conditions were 13% and 33%, respectively. The skewness and kurtosis decreased, with variation magnitudes of 1.36 and 8.11, respectively. By contrast, the relative changes in the robust counterparts of the mean and standard deviation were 3.7-10.5%, and the variation magnitudes of the robust skewness and kurtosis were only about 2-4% of those of the non-robust measures. Because they change relatively little between the before- and after-removal conditions, the robust measures can be regarded as outlier-resistant statistical measures.
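The before/after comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the robust measures, so the choices here (median, scaled median absolute deviation, a quartile-based skewness, and the octile-based kurtosis of Moors 1988) are assumptions drawn from the reference list, not necessarily the authors' estimators. The data values are synthetic, the function names are hypothetical, and the outlier is assumed to have been flagged already (for example by Rosner's test, which is not implemented here).

```python
import statistics as st


def quantile(sorted_x, p):
    """Linear-interpolation quantile of pre-sorted data, 0 <= p <= 1."""
    pos = p * (len(sorted_x) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_x) - 1)
    return sorted_x[lo] + (pos - lo) * (sorted_x[hi] - sorted_x[lo])


def classical_measures(x):
    """Moment-based location, spread, skewness, and kurtosis."""
    n = len(x)
    mean = st.fmean(x)
    m2, m3, m4 = (sum((v - mean) ** k for v in x) / n for k in (2, 3, 4))
    return {
        "location": mean,
        "spread": st.stdev(x),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


def robust_measures(x):
    """Quantile-based counterparts (assumed choices, see lead-in)."""
    xs = sorted(x)
    q1, q2, q3 = (quantile(xs, p) for p in (0.25, 0.50, 0.75))
    e = [quantile(xs, i / 8) for i in range(9)]  # octiles e[0]..e[8]
    mad = 1.4826 * st.median([abs(v - q2) for v in x])  # scaled MAD
    return {
        "location": q2,
        "spread": mad,
        "skewness": (q3 + q1 - 2 * q2) / (q3 - q1),
        "kurtosis": ((e[7] - e[5]) + (e[3] - e[1])) / (e[6] - e[2]),
    }


def percent_change(before, after):
    """Relative change ratio in percent."""
    return 100 * abs(after - before) / abs(before)


# Synthetic concentrations; the last value plays the role of a flagged outlier.
raw = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16, 0.10,
       0.17, 0.13, 0.15, 0.14, 0.12, 0.95]
cleaned = raw[:-1]  # outlier removal (zero-weighting)

for name, measure in (("classical", classical_measures),
                      ("robust", robust_measures)):
    b, a = measure(raw), measure(cleaned)
    print(name,
          "location %.1f%%" % percent_change(b["location"], a["location"]),
          "spread %.1f%%" % percent_change(b["spread"], a["spread"]),
          "skewness change %.2f" % abs(a["skewness"] - b["skewness"]),
          "kurtosis change %.2f" % abs(a["kurtosis"] - b["kurtosis"]))
```

On data like this, the quantile-based measures move far less than the moment-based ones when the flagged value is dropped, which is the sense in which the abstract calls them outlier-resistant.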

Keywords

statistical measures; robust measures; outlier; Rosner's test; Busan coastal sea

Language

Korean

References

1.

Marine Environment Information System (2016) Raw data - marine environmental monitoring network. http://www.meis.go.kr/ Accessed 18 Jan 2016

2.

Korea Marine Environment Management Corporation (2015) National marine environmental monitoring network data - Busan coastal sea. http://www.koem.or.kr/ Accessed 18 Jan 2016

3.

Barnett V, Lewis T (1994) Outliers in statistical data. John Wiley & Sons, 584 p

4.

Bonato M (2011) Robust estimation of skewness and kurtosis in distributions with infinite higher moments. Financ Res Lett 8:77-87

5.

7.

Erceg-Hurn DM, Mirosevich VM (2008) Modern robust statistical methods: an easy way to maximize the accuracy and power of your research. Am Psychol 63(7):591-601

9.

Huber PJ, Ronchetti EM (2009) Robust statistics. John Wiley & Sons, New York, 380 p

10.

Kim T-H, White H (2004) On more robust estimation of skewness and kurtosis. Financ Res Lett 1:56-73

11.

Martinez WL, Martinez AR (2005) Exploratory data analysis with MATLAB. Chapman & Hall/CRC, Boca Raton, 405 p

12.

Millard SP (2013) EnvStats: an R package for environmental statistics. Springer, New York, 291 p

13.

Moors JJA (1988) A quantile alternative for kurtosis. J Roy Stat Soc D-Sta 37(1):25-32

14.

R Core Team (2015) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://cloud.r-project.org/index.html Accessed 18 Jan 2016

15.

Rosner B (1983) Percentage points for a generalized ESD many-outlier procedure. Technometrics 25(2):165-172

16.

Rousseeuw PJ, Croux C (1993) Alternatives to the median absolute deviation. J Am Stat Assoc 88(424):1273-1283

17.

Rousseeuw PJ, Leroy AM (2003) Robust regression and outlier detection. John Wiley & Sons, New Jersey, 329 p