Development of a Gaussian Process Model using a Data Filtering Method

  • Received : 2016.01.06
  • Accepted : 2016.04.15
  • Published : 2016.04.30


Better energy management of existing buildings requires an accurate and fast prediction model. To this end, this study reports the development of a GP (Gaussian Process) model for an AHU fan in a real high-rise office building. The GP model is a statistical, data-driven model that requires far fewer inputs and less computing time than whole-building simulation tools. This paper addresses the following: 1) the characteristics of the GP model, 2) the development of the GP model, 3) the removal of outliers from the data gathered by the BEMS, and 4) validation of the GP model. In particular, RANSAC (RANdom SAmple Consensus) was employed to detect outliers in the measured data. It is concluded that the GP model accurately predicts the fan energy consumption and can be used for real-time optimal control and fault detection of building systems in the near future.
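The outlier removal step can be illustrated with a minimal sketch of the RANSAC idea. Everything here is an assumption made for illustration: the data are synthetic rather than the paper's BEMS measurements, the underlying model is taken to be a straight line, and the names `ransac_line`, `threshold`, and `n_iter` are hypothetical. It is not the authors' implementation.

```python
import random


def ransac_line(points, n_iter=200, threshold=1.0, seed=0):
    """Fit y = a*x + b by RANSAC; return (a, b, inlier_mask)."""
    rng = random.Random(seed)
    best_model, best_mask = None, None
    for _ in range(n_iter):
        # 1) Draw a minimal random sample: two points define a line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical pair cannot define y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # 2) Consensus set: points whose residual is within the threshold.
        mask = [abs(y - (a * x + b)) <= threshold for x, y in points]
        # 3) Keep the candidate with the largest consensus set.
        if best_mask is None or sum(mask) > sum(best_mask):
            best_model, best_mask = (a, b), mask
    if best_model is None:
        raise ValueError("no valid sample found")
    return best_model[0], best_model[1], best_mask


# Synthetic data (assumed): y = 2x + 1 with small deterministic noise.
points = []
for i in range(20):
    noise = 0.1 * ((i * 7) % 5 - 2) / 2
    points.append((float(i), 2.0 * i + 1.0 + noise))

# Inject three gross outliers, standing in for faulty sensor readings.
outlier_idx = [3, 11, 17]
for i in outlier_idx:
    x, y = points[i]
    points[i] = (x, y + 30.0)

a, b, mask = ransac_line(points)

# Filtered data set: only the points in the consensus set are kept.
clean = [p for p, keep in zip(points, mask) if keep]
print(f"slope={a:.2f}, intercept={b:.2f}, kept {len(clean)} of {len(points)}")
```

In a workflow like the one the abstract describes, only the retained points (`clean` in this sketch) would be passed on as training data for the GP model; fitting the GP itself is outside the scope of this example.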


Gaussian Process; Data Filtering; RANSAC; Building Energy Management System (BEMS)




Supported by: Korea Institute of Energy Technology Evaluation and Planning (KETEP)