• Title/Summary/Keyword: Ensemble

A Combination and Calibration of Multi-Model Ensemble of PyeongChang Area Using Ensemble Model Output Statistics (Ensemble Model Output Statistics를 이용한 평창지역 다중 모델 앙상블 결합 및 보정)

  • Hwang, Yuseon;Kim, Chansoo
    • Atmosphere / v.28 no.3 / pp.247-261 / 2018
  • The objective of this paper is to compare probabilistic temperature forecasts from different regional and global ensemble prediction systems over the PyeongChang area. A statistical post-processing method is used to combine and calibrate forecasts from different numerical prediction systems, giving greater weight to the ensemble model that performs best. Temperature observations were obtained from 30 stations in PyeongChang, and three ensemble forecasts were used: the European Centre for Medium-Range Weather Forecasts, the Ensemble Prediction System for Global, and the Limited Area Ensemble Prediction System, covering 1 May 2014 to 18 March 2017. Before post-processing, a reliability analysis was conducted to assess the statistical consistency between the ensemble forecasts and the corresponding observations. Ensemble model output statistics and bias-correction methods were then applied to each raw ensemble model and to the proposed weighted combination of ensembles. The results showed that the proposed methods perform better than the raw ensemble mean. In particular, the multi-model forecast based on ensemble model output statistics was superior to the bias-corrected forecast in terms of deterministic prediction.
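
As a rough illustration of the kind of post-processing described above, the sketch below builds a Gaussian predictive distribution in the usual ensemble model output statistics form (mean as a weighted sum of ensemble means, variance as a linear function of ensemble variance) and scores it with the closed-form CRPS for a normal distribution. The coefficient values and the function names `emos_params` and `crps_normal` are hypothetical; the paper's fitted coefficients and exact formulation are not given in the abstract.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def emos_params(model_means, ens_variance, a, b, c, d):
    """Return (mu, sigma) with mu = a + sum(b_k * mean_k), sigma^2 = c + d * S^2.

    a, b, c, d are assumed to be already fitted; fitting is not shown here.
    """
    mu = a + sum(bk * mk for bk, mk in zip(b, model_means))
    variance = c + d * ens_variance
    return mu, math.sqrt(variance)


def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast for observation y."""
    z = (y - mu) / sigma
    return sigma * (
        z * (2.0 * normal_cdf(z) - 1.0)
        + 2.0 * normal_pdf(z)
        - 1.0 / math.sqrt(math.pi)
    )


# Hypothetical example: two model ensemble means (deg C) and a pooled variance.
mu, sigma = emos_params([10.0, 12.0], 4.0, a=0.5, b=[0.5, 0.5], c=1.0, d=0.5)
score = crps_normal(mu, sigma, y=11.0)
```

Lower CRPS values indicate a sharper and better calibrated forecast, so a score like this could serve as the objective when estimating the coefficients.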

Typhoon Wukong (200610) Prediction Based on The Ensemble Kalman Filter and Ensemble Sensitivity Analysis (앙상블 칼만 필터를 이용한 태풍 우쿵 (200610) 예측과 앙상블 민감도 분석)

  • Park, Jong Im;Kim, Hyun Mee
    • Atmosphere / v.20 no.3 / pp.287-306 / 2010
  • An ensemble Kalman filter (EnKF) with the Weather Research and Forecasting (WRF) Model is applied to Typhoon Wukong (200610) to investigate how the performance of ensemble forecasts depends on the experimental configuration of the EnKF. In addition, ensemble sensitivity analysis is applied to the forecast and analysis ensembles generated by the EnKF to investigate whether it can be used as guidance for adaptive observations. Various experimental configurations are tested by changing the model error, ensemble size, assimilation time window, covariance relaxation, and covariance localization in the EnKF. First, experiments using a different physical parameterization scheme for each ensemble member show a smaller root mean square error than those using a single physics scheme for all forecast ensemble members, which implies that accounting for model error leads to better forecasts. A larger ensemble is also more beneficial than a smaller one. For the assimilation time window, the experiment using a less frequent window shows better results than that using a more frequent window, which is associated with the availability of observational data in this study. Therefore, incorporating model error, a larger ensemble size, and a less frequent assimilation window into the EnKF improves the prediction of Typhoon Wukong (200610). Covariance relaxation and localization are less beneficial to the forecasts than the factors mentioned above. The ensemble sensitivity analysis shows that the sensitive regions for adaptive observations can be determined from the sensitivity of the forecast measure of interest to the initial ensembles. In addition, the sensitivities calculated by the ensemble sensitivity analysis can be explained by dynamical relationships among wind, temperature, and pressure.
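
The sensitivity of a forecast measure to the initial ensemble can be sketched as a simple regression slope. The example below assumes the common univariate form, cov(J, x) / var(x), evaluated at a single grid point; whether the paper uses exactly this form is not stated in the abstract, and the function name and sample values are hypothetical.

```python
def ensemble_sensitivity(initial_values, forecast_metric):
    """Estimate dJ/dx as cov(J, x) / var(x) across ensemble members.

    initial_values: one initial-state value per member (e.g. pressure at a point).
    forecast_metric: the forecast measure J for the same members.
    """
    n = len(initial_values)
    x_mean = sum(initial_values) / n
    j_mean = sum(forecast_metric) / n
    cov = sum(
        (x - x_mean) * (j - j_mean)
        for x, j in zip(initial_values, forecast_metric)
    ) / (n - 1)
    var = sum((x - x_mean) ** 2 for x in initial_values) / (n - 1)
    return cov / var


# Hypothetical four-member ensemble where J responds linearly to x.
slope = ensemble_sensitivity([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Repeating this calculation at every grid point would give a sensitivity map, and regions with large magnitudes would be candidates for adaptive observations.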

Improving Bagging Predictors

  • Kim, Hyun-Joong;Chung, Dong-Jun
    • Proceedings of the Korean Statistical Society Conference / 2005.11a / pp.141-146 / 2005
  • The ensemble method is known as one of the most powerful classification tools for improving prediction accuracy. It is also understood as a ‘perturb and combine’ strategy. Many studies have tried to develop ensemble methods by improving the perturbation step. In this paper, we propose two new ensemble methods that improve the combining step, based on the idea of pattern matching. In experiments with simulated data and real datasets, the proposed ensemble methods performed better than bagging. The proposed ensemble methods give the most accurate predictions when a pruned tree is used as the base learner.

Optimization of Random Subspace Ensemble for Bankruptcy Prediction (재무부실화 예측을 위한 랜덤 서브스페이스 앙상블 모형의 최적화)

  • Min, Sung-Hwan
    • Journal of Information Technology Services / v.14 no.4 / pp.121-135 / 2015
  • Ensemble classification uses multiple classifiers instead of a single classifier. Ensemble classifiers have recently attracted much attention in the data mining community, and ensemble learning techniques have proved very useful for improving prediction accuracy. Bagging, boosting, and random subspace are the most popular ensemble methods. In random subspace, each base classifier is trained on a randomly chosen feature subspace of the original feature space. The outputs of the base classifiers are usually aggregated by a simple majority vote. In this study, we applied the random subspace method to the bankruptcy problem and proposed a method for optimizing the random subspace ensemble. A genetic algorithm was used to optimize the classifier subset of the random subspace ensemble for bankruptcy prediction. The proposed genetic-algorithm-based random subspace ensemble model was applied to the bankruptcy prediction problem using a real data set and compared with other models. Experimental results showed that the proposed model outperformed the other models.
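
The random subspace procedure described above (random feature subsets per base classifier, then a simple majority vote) can be sketched as follows. The nearest-centroid base learner, the function names, and the toy data are assumptions chosen to keep the example short; the abstract does not say which base classifier the study used, and the genetic-algorithm optimization is not shown.

```python
import random
from collections import Counter


def train_centroids(X, y, features):
    """Fit a nearest-centroid classifier using only the given feature indices."""
    sums = {}
    counts = Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(centroids, features, row):
    """Return the label whose centroid is closest in the selected subspace."""
    def sq_dist(center):
        return sum((row[f] - center[k]) ** 2 for k, f in enumerate(features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def fit_random_subspace(X, y, n_classifiers=5, subspace_size=2, seed=0):
    """Train each base classifier on its own random subset of features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_classifiers):
        features = sorted(rng.sample(range(n_features), subspace_size))
        ensemble.append((features, train_centroids(X, y, features)))
    return ensemble


def predict_vote(ensemble, row):
    """Aggregate base classifier outputs by simple majority vote."""
    votes = Counter(predict_centroid(c, f, row) for f, c in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): label 0 clusters near 0, label 1 clusters near 5.
X = [[0, 0, 0], [1, 0, 1], [0, 1, 0], [5, 5, 5], [6, 5, 4], [5, 6, 5]]
y = [0, 0, 0, 1, 1, 1]
ensemble = fit_random_subspace(X, y)
```

Selecting a subset of the trained classifiers, as the study does with a genetic algorithm, would operate on the `ensemble` list produced here.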

Improving an Ensemble Model Using Instance Selection Method (사례 선택 기법을 활용한 앙상블 모형의 성능 개선)

  • Min, Sung-Hwan
    • Journal of the Society of Korea Industrial and Systems Engineering / v.39 no.1 / pp.105-115 / 2016
  • Ensemble classification combines individually trained classifiers to yield more accurate predictions than individual models. Ensemble techniques are very useful for improving the generalization ability of classifiers. The random subspace ensemble technique is a simple but effective method for constructing ensemble classifiers; it randomly draws a subset of the features for each classifier in the ensemble. The instance selection technique selects critical instances while removing irrelevant and noisy instances from the original dataset. The instance selection and random subspace methods are both well known in the field of data mining and have proven very effective in many applications. However, few studies have focused on integrating the two. Therefore, this study proposes a new hybrid ensemble model that integrates instance selection and random subspace techniques using genetic algorithms (GAs) to improve the performance of a random subspace ensemble model. GAs are used to select optimal (or near-optimal) instances, which are used as input data for the random subspace ensemble model. The proposed model was applied to both Kaggle credit data and corporate credit data, and the results were compared with those of other models in terms of classification accuracy, levels of diversity, and average classification rates of base classifiers in the ensemble. The experimental results demonstrated that the proposed model outperformed the other models, including the single model, the instance selection model, and the original random subspace ensemble model.

Implementation of the Ensemble Kalman Filter to a Double Gyre Ocean and Sensitivity Test using Twin Experiments (Double Gyre 모형 해양에서 앙상블 칼만필터를 이용한 자료동화와 쌍둥이 실험들을 통한 민감도 시험)

  • Kim, Young-Ho;Lyu, Sang-Jin;Choi, Byoung-Ju;Cho, Yang-Ki;Kim, Young-Gyu
    • Ocean and Polar Research / v.30 no.2 / pp.129-140 / 2008
  • As a preliminary effort to establish a data assimilative ocean forecasting system, we reviewed the theory of the Ensemble Kalman Filter (EnKF) and developed practical techniques to apply the EnKF algorithm in a real ocean circulation modeling system. To verify the performance of the developed EnKF algorithm, a wind-driven double gyre was established in a rectangular ocean using the Regional Ocean Modeling System (ROMS), and the EnKF algorithm was implemented. In this idealized ocean, sea surface temperature and sea surface height were assimilated. The results showed that the multivariate background error covariance is useful in the EnKF system. We also tested the sensitivity of the EnKF algorithm to the localization and inflation of the background error covariance and to the number of ensemble members. In the sensitivity tests, the ensemble spread as well as the root-mean-square (RMS) error of the ensemble mean was assessed. The EnKF produces the optimal solution as the ensemble spread approaches the RMS error of the ensemble mean, because the ensembles are then well distributed and may include the true state. The localization and inflation of the background error covariance increased the ensemble spread while building up well-distributed ensembles. Without localization of the background error covariance, the ensemble spread tended to decrease continuously over time. In addition, the ensemble spread is proportional to the number of ensemble members. However, it is difficult to increase the number of ensemble members because of the computational cost.
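
The two diagnostics assessed in the sensitivity tests, ensemble spread and RMS error of the ensemble mean, can be computed as in the sketch below. The function names and the two-member toy ensemble are hypothetical, and the spread is taken here as the square root of the domain-averaged sample variance, which is one common definition that the abstract does not confirm.

```python
import math


def ensemble_mean(members):
    """Point-wise mean over ensemble members (each member is a list of values)."""
    n = len(members)
    return [sum(m[i] for m in members) / n for i in range(len(members[0]))]


def rmse_of_mean(members, truth):
    """RMS error of the ensemble mean against the true state."""
    mean = ensemble_mean(members)
    return math.sqrt(sum((a - t) ** 2 for a, t in zip(mean, truth)) / len(truth))


def ensemble_spread(members):
    """Square root of the domain-averaged sample variance about the mean."""
    mean = ensemble_mean(members)
    n = len(members)
    variances = [
        sum((m[i] - mean[i]) ** 2 for m in members) / (n - 1)
        for i in range(len(mean))
    ]
    return math.sqrt(sum(variances) / len(variances))


# Toy example (hypothetical values): two members, two grid points.
members = [[1.0, 2.0], [3.0, 4.0]]
truth = [2.0, 4.0]
ratio = rmse_of_mean(members, truth) / ensemble_spread(members)
```

A ratio close to 1 corresponds to the situation the abstract describes as well distributed; a spread that shrinks far below the error suggests the ensemble no longer covers the true state.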

Comparison of Ensemble Perturbations using Lorenz-95 Model: Bred vectors, Orthogonal Bred vectors and Ensemble Transform Kalman Filter(ETKF) (로렌쯔-95 모델을 이용한 앙상블 섭동 비교: 브레드벡터, 직교 브레드벡터와 앙상블 칼만 필터)

  • Chung, Kwan-Young;Barker, Dale;Moon, Sun-Ok;Jeon, Eun-Hee;Lee, Hee-Sang
    • Atmosphere / v.17 no.3 / pp.217-230 / 2007
  • Using the simple Lorenz-95 model, which can simulate many atmospheric characteristics, we compare the performance of three ensemble strategies: bred vectors, rotated bred vectors (made orthogonal to each bred member), and the Ensemble Transform Kalman Filter (ETKF). The performance metrics are the RMSE of the ensemble mean, the ratio of the RMS error of the ensemble mean to the ensemble spread, rank histograms to check whether the ensemble members represent the true probability density function (pdf) well, and the distribution of eigenvalues of the forecast ensemble, which provides useful information on the independence of each member. The orthogonal bred vectors show considerable improvement over the bred vectors in RMSE, spread, and independence of members. When the bred vectors are rotated for orthogonalization, the improvement rate for the ensemble spread is almost double that for the RMS error of the ensemble mean, relative to the non-rotated bred vectors in the simple model. This result appears consistent with a tentative test on the operational model at KMA. In conclusion, ETKF is superior to the other two methods on all the assessment measures we used for ensemble prediction. However, we cannot decide which perturbation strategy is better with respect to the structure of the background error covariance. Further studies appear to be needed on the best perturbation method for hybrid variational data assimilation that accounts for the error of the day (EOTD).
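
For readers unfamiliar with the test bed, the Lorenz-95 model (often also called Lorenz-96) evolves cyclic variables according to dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F. The sketch below integrates it with a fourth-order Runge-Kutta step. The choices of 40 variables, F = 8, and a time step of 0.05 are the values conventionally used with this model and are assumptions here, since the abstract does not give the configuration.

```python
def lorenz95_tendency(x, forcing=8.0):
    """Time derivative of the cyclic Lorenz-95 state vector x."""
    n = len(x)
    # Negative indices wrap around in Python, matching the cyclic boundary.
    return [
        (x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
        for i in range(n)
    ]


def rk4_step(x, dt=0.05, forcing=8.0):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = lorenz95_tendency(x, forcing)
    k2 = lorenz95_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k1)], forcing)
    k3 = lorenz95_tendency([xi + 0.5 * dt * k for xi, k in zip(x, k2)], forcing)
    k4 = lorenz95_tendency([xi + dt * k for xi, k in zip(x, k3)], forcing)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


# x_i = F for all i is a steady state; a small perturbation seeds the chaos.
state = [8.0] * 40
state[0] += 0.01
for _ in range(100):
    state = rk4_step(state)
```

Ensemble perturbation schemes such as bred vectors would be built on top of this integrator by evolving perturbed and unperturbed states side by side and rescaling their difference.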

Optimal Selection of Classifier Ensemble Using Genetic Algorithms (유전자 알고리즘을 이용한 분류자 앙상블의 최적 선택)

  • Kim, Myung-Jong
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.99-112 / 2010
  • Ensemble learning is a method for improving the performance of classification and prediction algorithms. It finds a highly accurate classifier on the training set by constructing and combining an ensemble of weak classifiers, each of which needs only to be moderately accurate on the training set. Ensemble learning has received considerable attention in the machine learning and artificial intelligence fields because of its remarkable performance improvement and flexible integration with traditional learning algorithms such as decision trees (DT), neural networks (NN), and SVM. In this research, DT ensemble studies have consistently demonstrated impressive improvements in the generalization behavior of DT, while NN and SVM ensemble studies have not shown performance gains as remarkable as those of DT ensembles. Recently, several works have reported that the performance of an ensemble can be degraded when its classifiers are highly correlated with one another, resulting in a multicollinearity problem. They have also proposed differentiated learning strategies to cope with this performance degradation. Hansen and Salamon (1990) argued that it is necessary and sufficient for the performance enhancement of an ensemble that the ensemble contain diverse classifiers. Breiman (1996) showed that ensemble learning can increase the performance of unstable learning algorithms, but does not yield remarkable improvement for stable learning algorithms. Unstable learning algorithms such as decision tree learners are sensitive to changes in the training data, and thus small changes in the training data can yield large changes in the generated classifiers. Therefore, an ensemble of unstable learning algorithms can guarantee some diversity among the classifiers.
In contrast, stable learning algorithms such as NN and SVM generate similar classifiers in spite of small changes in the training data, and thus the correlation among the resulting classifiers is very high. This high correlation results in a multicollinearity problem, which leads to performance degradation of the ensemble. Kim's work (2009) compared the performance of traditional prediction algorithms such as NN, DT, and SVM in bankruptcy prediction for Korean firms. It reports that stable learning algorithms such as NN and SVM have higher predictability than the unstable DT. Meanwhile, with respect to ensemble learning, the DT ensemble shows a greater performance improvement than the NN and SVM ensembles. Further analysis with the variance inflation factor (VIF) empirically shows that the performance degradation of the ensemble is due to the multicollinearity problem. It also proposes that optimization of the ensemble is needed to cope with such a problem. This paper proposes a hybrid system for coverage optimization of NN ensembles (CO-NN) in order to improve the performance of the NN ensemble. Coverage optimization is a technique for choosing a sub-ensemble from an original ensemble so as to guarantee the diversity of classifiers. CO-NN uses a GA, which has been widely used for various optimization problems, to deal with the coverage optimization problem. The GA chromosomes for coverage optimization are encoded as binary strings, each bit of which indicates an individual classifier. The fitness function is defined as maximization of error reduction, and a constraint on the variance inflation factor (VIF), one of the most commonly used measures of multicollinearity, is added to ensure the diversity of classifiers by removing high correlation among them. We use Microsoft Excel and the GA software package Evolver.
Experiments on company failure prediction have shown that CO-NN is effective for the stable performance enhancement of NN ensembles, achieved by choosing classifiers with the correlations of the ensemble in mind. Classifiers with a potential multicollinearity problem are removed by the coverage optimization process of CO-NN, and CO-NN thereby shows higher performance than a single NN classifier and the NN ensemble at the 1% significance level, and than the DT ensemble at the 5% significance level. However, further research issues remain. First, a decision optimization process to find the optimal combination function should be considered. Second, various learning strategies for dealing with data noise should be introduced in future research.
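
The VIF constraint on a binary chromosome can be sketched as below. Each bit selects one classifier, and the VIF of each selected classifier's output is read from the diagonal of the inverse correlation matrix, which equals 1 / (1 - R^2) for that classifier regressed on the others. The function names, the threshold of 10 (a common rule of thumb), and the toy outputs are assumptions; the paper's actual fitness function and its Evolver configuration are not reproduced here.

```python
import math


def correlation(u, v):
    """Pearson correlation of two equal-length sequences (must not be constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (fails on singular input)."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(outputs):
    """VIF of each classifier's outputs: diagonal of the inverse correlation matrix."""
    k = len(outputs)
    corr = [[correlation(outputs[i], outputs[j]) for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(k)]


def is_feasible(chromosome, outputs, vif_limit=10.0):
    """Check the VIF constraint for the sub-ensemble encoded by a binary string."""
    selected = [o for bit, o in zip(chromosome, outputs) if bit == 1]
    if len(selected) < 2:
        return True  # multicollinearity needs at least two classifiers
    return max(vifs(selected)) <= vif_limit


# Hypothetical outputs: b nearly duplicates a, while c is uncorrelated with a.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [1.0, 2.0, 3.0, 4.0, 5.1]
c = [2.0, 1.0, 3.0, 1.0, 2.0]
```

In a GA, infeasible chromosomes could be discarded or penalized before the error-reduction fitness is evaluated; which of these the paper does is not stated in the abstract.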

Compressed Ensemble of Deep Convolutional Neural Networks with Global and Local Facial Features for Improved Face Recognition (얼굴인식 성능 향상을 위한 얼굴 전역 및 지역 특징 기반 앙상블 압축 심층합성곱신경망 모델 제안)

  • Yoon, Kyung Shin;Choi, Jae Young
    • Journal of Korea Multimedia Society / v.23 no.8 / pp.1019-1029 / 2020
  • In this paper, we propose a novel knowledge distillation algorithm to create a compressed deep ensemble network that makes combined use of local and global features of face images. To transfer the high-level recognition performance of the ensemble deep networks to a single deep network, the class prediction probability, which is the softmax output of the ensemble network, is used as a soft target for training the single deep network. By applying the knowledge distillation algorithm, the local feature information obtained by training the deep ensemble network on facial subregions of the face image is transferred to a single deep network, creating a so-called compressed ensemble DCNN. The experimental results demonstrate that our proposed compressed ensemble deep network maintains the recognition performance of the complex ensemble deep networks and outperforms a single deep network. In addition, our proposed method significantly reduces storage (memory) space and execution time compared to the conventional ensemble deep networks developed for face recognition.
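
The soft-target idea can be sketched in a few lines. In the example below the ensemble's softmax output is assumed to be the average of the member softmax outputs, and the student is trained with a cross-entropy loss against that soft target; the averaging rule, the optional temperature parameter, and the function names are assumptions, since the abstract does not specify them.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax, optionally softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_target(member_logits, temperature=1.0):
    """Average the class probabilities of all ensemble members (assumed rule)."""
    probs = [softmax(logits, temperature) for logits in member_logits]
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]


def distillation_loss(student_logits, soft_target, temperature=1.0):
    """Cross-entropy between the soft target and the student's softmax output."""
    q = softmax(student_logits, temperature)
    return -sum(t * math.log(qc) for t, qc in zip(soft_target, q))


# Hypothetical logits from a two-member ensemble over three identities.
target = ensemble_soft_target([[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
good = distillation_loss([2.0, 0.0, 0.0], target)
bad = distillation_loss([0.0, 2.0, 0.0], target)
```

Here `good` is smaller than `bad`, because the loss is minimized when the student reproduces the ensemble's class probabilities.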

Wind Prediction with a Short-range Multi-Model Ensemble System (단시간 다중모델 앙상블 바람 예측)

  • Yoon, Ji Won;Lee, Yong Hee;Lee, Hee Choon;Ha, Jong-Chul;Lee, Hee Sang;Chang, Dong-Eon
    • Atmosphere / v.17 no.4 / pp.327-337 / 2007
  • In this study, we examined a new ensemble training approach to reduce the systematic error and improve the prediction skill for wind using the Short-range Ensemble prediction system (SENSE), a mesoscale multi-model ensemble prediction system. SENSE has 16 ensemble members based on the MM5, WRF ARW, and WRF NMM. We evaluated the skill of surface wind prediction against AWS (Automatic Weather Station) observations during the summer season (June - August, 2006). In the first stage, the initial state of each member was corrected with respect to the observed values; the corrected members then went through a training stage to determine an adaptive weight function, which is formulated using the Root Mean Square Vector Error (RMSVE). Experiments on sensitivity to the training interval showed that the optimal training period was 1 day. The resulting weighted ensemble average shows smaller errors in the spatial and temporal pattern of wind speed than the simple ensemble average.
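
One simple way to turn RMSVE into ensemble weights is sketched below: each member's RMSVE over the training period is computed from its (u, v) wind errors, and the weights are taken as normalized inverse errors. The inverse-error rule, the small `eps` guard, and the function names are assumptions for illustration; the abstract only says the weight function is formulated by RMSVE and does not give its form.

```python
import math


def rmsve(forecast, observed):
    """Root mean square vector error over paired (u, v) wind vectors."""
    squared = [
        (uf - uo) ** 2 + (vf - vo) ** 2
        for (uf, vf), (uo, vo) in zip(forecast, observed)
    ]
    return math.sqrt(sum(squared) / len(squared))


def inverse_error_weights(errors, eps=1e-9):
    """Normalized weights proportional to 1 / error (assumed weighting rule)."""
    inverse = [1.0 / (e + eps) for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_mean_wind(member_winds, weights):
    """Weighted ensemble average of one (u, v) vector per member."""
    u = sum(w * m[0] for w, m in zip(weights, member_winds))
    v = sum(w * m[1] for w, m in zip(weights, member_winds))
    return (u, v)


# Hypothetical training errors for two members, then a weighted forecast.
weights = inverse_error_weights([1.0, 3.0])
wind = weighted_mean_wind([(4.0, 0.0), (0.0, 4.0)], weights)
```

With a 1-day training period, as reported in the abstract, the errors passed to `inverse_error_weights` would be recomputed each day from the previous day's forecasts.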