Double-Bagging Ensemble Using WAVE

  • Kim, Ahhyoun (Department of Applied Statistics, Yonsei University) ;
  • Kim, Minji (Department of Applied Statistics, Yonsei University) ;
  • Kim, Hyunjoong (Department of Applied Statistics, Yonsei University)
  • Received : 2014.06.08
  • Accepted : 2014.07.29
  • Published : 2014.09.30

Abstract

A classification ensemble method aggregates different classifiers obtained from training data to classify new data points. Voting algorithms are typical tools for summarizing the outputs of the classifiers in an ensemble. WAVE, proposed by Kim et al. (2011), is a weight-adjusted voting algorithm that assigns an optimal weight vector to the classifiers in an ensemble. In this study, we applied the WAVE algorithm to the double-bagging method (Hothorn and Lausen, 2003) when constructing ensembles, to examine whether it yields a significant improvement in performance. The results showed that double-bagging with the WAVE algorithm performs better than other ensemble methods that employ plurality voting. In addition, double-bagging with the WAVE algorithm is comparable to the random forest method when the ensemble size is large.
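As a minimal illustration of the voting step only, the sketch below aggregates the class predictions of an ensemble with a classifier weight vector rather than by plurality (equal-weight) voting. The classifier weights are taken as given here; in WAVE they would be computed from the classifiers' performance on the training data as described in Kim et al. (2011), and the base classifiers in double-bagging would be trees grown on bootstrap samples and augmented with discriminant variables estimated from the out-of-bag observations (Hothorn and Lausen, 2003). The function and variable names are illustrative, not taken from the paper.

    import numpy as np

    def weighted_vote(pred_labels, weights, n_classes):
        """Aggregate ensemble predictions by weighted voting.

        pred_labels : (n_classifiers, n_samples) integer class labels
        weights     : (n_classifiers,) nonnegative classifier weights;
                      equal weights reduce this to plurality voting
        n_classes   : number of distinct classes
        """
        n_clf, n_samples = pred_labels.shape
        scores = np.zeros((n_samples, n_classes))
        for j in range(n_clf):
            # add classifier j's weight to the class it votes for, sample by sample
            scores[np.arange(n_samples), pred_labels[j]] += weights[j]
        return scores.argmax(axis=1)

    # toy example: three classifiers, four samples, two classes
    preds = np.array([[0, 1, 1, 1],
                      [0, 0, 1, 0],
                      [1, 1, 0, 0]])
    print(weighted_vote(preds, np.ones(3) / 3, 2))               # plurality voting
    print(weighted_vote(preds, np.array([0.55, 0.25, 0.20]), 2))  # hypothetical weight vector

With equal weights the last sample is decided by a simple majority (class 0); with the unequal weights the heavily weighted first classifier overturns that majority (class 1), which is the effect a weight-adjusted vote exploits.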

Keywords

References

  1. Asuncion, A. and Newman, D. J. (2007). UCI machine learning repository, University of California, Irvine, School of Information and Computer Sciences, http://archive.ics.uci.edu/ml/.
  2. Bauer, E. and Kohavi, R. (1999). An empirical comparison of voting classification algorithms: Bagging, boosting, and variants, Machine Learning, 36, 105-139. https://doi.org/10.1023/A:1007515423169
  3. Breiman, L. (1996a). Bagging predictors, Machine Learning, 24, 123-140.
  4. Breiman, L. (1996b). Out-of-bag estimation, Technical Report, Statistics Department, University of California Berkeley, Berkeley, California 94708, http://www.stat.berkeley.edu/~breiman/OOBestimation.pdf.
  5. Breiman, L. (2001). Random forests, Machine Learning, 45, 5-32. https://doi.org/10.1023/A:1010933404324
  6. Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984). Classification and Regression Trees, Chapman and Hall, New York.
  7. Dietterich, T. (2000). Ensemble Methods in Machine Learning, Springer, Berlin.
  8. Efron, B. and Tibshirani, R. (1986). Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy, Statistical Science, 1, 54-75. https://doi.org/10.1214/ss/1177013815
  9. Freund, Y. and Schapire, R. (1996). Experiments with a new boosting algorithm, In Proceedings of the Thirteenth International Conference on Machine Learning, 96, 148-156.
  10. Heinz, G., Peterson, L. J., Johnson, R. W. and Kerk, C. J. (2003). Exploring relationships in body dimensions, Journal of Statistics Education, 11, http://www.amstat.org/publications/jse/v11n2/datasets.heinz.html.
  11. Ho, T. K., Hull, J. J. and Srihari, S. N. (1994). Decision combination in multiple classifier systems, IEEE Transactions on Pattern Analysis and Machine Intelligence, 16, 66-75.
  12. Hothorn, T. and Lausen, B. (2003). Double-bagging: Combining classifiers by bootstrap aggregation, Pattern Recognition, 36, 1303-1309. https://doi.org/10.1016/S0031-3203(02)00169-3
  13. Kim, H. and Loh, W. Y. (2001). Classification trees with unbiased multiway splits, Journal of the American Statistical Association, 96, 589-604. https://doi.org/10.1198/016214501753168271
  14. Kim, H. and Loh, W. Y. (2003). Classification trees with bivariate linear discriminant node models, Journal of Computational and Graphical Statistics, 12, 512-530. https://doi.org/10.1198/1061860032049
  15. Kim, H., Kim, H., Moon, H. and Ahn, H. (2011). A weight-adjusted voting algorithm for ensembles of classifiers, Journal of the Korean Statistical Society, 40, 437-449. https://doi.org/10.1016/j.jkss.2011.03.002
  16. Liaw, A. and Wiener, M. (2002). Classification and regression by randomForest, R News, 2, 18-22.
  17. Loh, W. Y. (2009). Improving the precision of classification trees, The Annals of Applied Statistics, 3, 1710-1737. https://doi.org/10.1214/09-AOAS260
  18. Opitz, D. and Maclin, R. (1999). Popular ensemble methods: An empirical study, Journal of Artificial Intelligence Research, 11, 169-198.
  19. Oza, N. C. and Tumer, K. (2008). Classifier ensembles: Select real-world applications, Information Fusion, 9, 4-20. https://doi.org/10.1016/j.inffus.2007.07.002
  20. Skurichina, M. and Duin, R. P. (1998). Bagging for linear classifiers, Pattern Recognition, 31, 909-930. https://doi.org/10.1016/S0031-3203(97)00110-6
  21. Statlib (2010). Datasets archive, Carnegie Mellon University, Department of Statistics, http://lib.stat.cmu.edu.
  22. Terhune, J. M. (1994). Geographical variation of harp seal underwater vocalisations, Canadian Journal of Zoology, 72, 892-897. https://doi.org/10.1139/z94-121
  23. Therneau, T. and Atkinson, E. (1997). An introduction to recursive partitioning using the RPART routines, Mayo Foundation, Rochester, Minnesota. http://eric.univ-lyon2.fr/~ricco/cours/didacticiels/r/longdocrpart.pdf.
  24. Tumer, K. and Oza, N. C. (2003). Input decimated ensembles, Pattern Analysis and Applications, 6, 65-77. https://doi.org/10.1007/s10044-002-0181-7
  25. Zhu, J., Zou, H., Rosset, S. and Hastie, T. (2009). Multi-class AdaBoost, Statistics and Its Interface, 2, 349-360. https://doi.org/10.4310/SII.2009.v2.n3.a8