Visualizing Multi-Variable Prediction Functions by Segmented k-CPG's

Authors
Huh, Myung-Hoe

Abstract
Machine learning methods such as support vector machines and random forests yield nonparametric prediction functions of the form $y = f(x_1, \ldots, x_p)$. As a sequel to the previous article on visualizing nonparametric functions (Huh and Lee, 2008), I propose more sensible graphs for visualizing $y = f(x_1, \ldots, x_p)$, which have two clear advantages over the previous simple graphs. The new graphs show a small number of prototype curves of $f(x_1, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_p)$ as functions of $x_j$, and they reveal the statistically plausible portion of each curve, that is, an interval of $x_j$ that changes with $(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_p)$. To complement the visual display, matching importance measures are produced for each of the p predictor variables. The proposed graphs and importance measures are validated in simulated settings and demonstrated on an environmental study.
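The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading, not the author's procedure. It assumes that each observation yields a curve of $f$ in $x_j$ with the other predictors held at their observed values, that k-means clustering (listed in the keywords) reduces these curves to k prototypes, and that a prototype's plausible segment can be approximated by the observed range of $x_j$ among the observations in its cluster. The function names, the toy prediction function and the min-max rule for segments are all hypothetical.

```python
import random

def conditional_curves(f, X, j, grid):
    """For each row of X, vary coordinate j over grid, holding the rest fixed."""
    curves = []
    for row in X:
        curve = []
        for g in grid:
            point = list(row)
            point[j] = g
            curve.append(f(point))
        curves.append(curve)
    return curves

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans(curves, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm, treating each curve as a vector."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(curves, k)]
    labels = [0] * len(curves)
    for _ in range(n_iter):
        # Assignment step: nearest center for every curve.
        labels = [min(range(k), key=lambda c: sq_dist(curve, centers[c]))
                  for curve in curves]
        # Update step: pointwise mean of the member curves.
        for c in range(k):
            members = [cv for cv, lab in zip(curves, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

def plausible_segments(X, j, labels, k):
    """ASSUMPTION: segment = observed min/max of x_j within each cluster."""
    segments = []
    for c in range(k):
        vals = [row[j] for row, lab in zip(X, labels) if lab == c]
        segments.append((min(vals), max(vals)) if vals else None)
    return segments

# Toy stand-in for a fitted model: an interaction between x_1 and x_2.
def f(x):
    return x[0] * x[1]

X = [[i / 9, 1.0 if i % 2 == 0 else -1.0] for i in range(10)]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]

curves = conditional_curves(f, X, j=0, grid=grid)
centers, labels = kmeans(curves, k=2)
segments = plausible_segments(X, 0, labels, 2)

for center, seg in zip(centers, segments):
    print("prototype:", center, "plausible x_j range:", seg)
```

In this toy case the curves in $x_1$ split into a rising group and a falling group according to the sign of $x_2$, so two prototypes summarize all ten curves. Plotting each prototype only over its segment would give a segmented display in the spirit of the abstract.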
Keywords
Visualization of prediction functions; k-means clustering; variable importance; support vector machine; random forests; environmental data
Language
English
Cited by
1. Visualizing SVM Classification in Reduced Dimensions, Communications for Statistical Applications and Methods, 2009, 16(5), 881-889
References
1. Breiman, L. (2001). Random forests, Machine Learning, 45, 5-32.
2. Breiman, L. and Friedman, J. (1985). Estimating optimal transformations for multiple regression and correlation, Journal of the American Statistical Association, 80, 580-598.
3. Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning, Springer, New York.
4. Huh, M. H. and Lee, Y. (2008). Simple graphs for complex prediction functions, Communications of the Korean Statistical Society, 15, 343-351.
5. Strobl, C., Boulesteix, A., Kneib, T., Augustin, T. and Zeileis, A. (2008). Conditional variable importance for random forests, BMC Bioinformatics, 9, 307.