• Title/Summary/Keyword: Subspace methods

Improving an Ensemble Model Using Instance Selection Method (사례 선택 기법을 활용한 앙상블 모형의 성능 개선)

  • Min, Sung-Hwan
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.105-115 / 2016
  • Ensemble classification involves combining individually trained classifiers to yield predictions that are more accurate than those of the individual models, and ensemble techniques are very useful for improving the generalization ability of classifiers. The random subspace ensemble technique is a simple but effective method for constructing ensemble classifiers: each classifier in the ensemble is trained on a randomly drawn subset of the features. The instance selection technique involves selecting critical instances while removing irrelevant and noisy instances from the original dataset. The instance selection and random subspace methods are both well known in the field of data mining and have proven to be very effective in many applications. However, few studies have focused on integrating them. Therefore, this study proposed a new hybrid ensemble model that integrates instance selection and random subspace techniques using genetic algorithms (GAs) to improve the performance of a random subspace ensemble model. GAs are used to select optimal (or near-optimal) instances, which are then used as input data for the random subspace ensemble model. The proposed model was applied to both Kaggle credit data and corporate credit data, and the results were compared with those of other models in terms of classification accuracy, level of diversity, and average classification rate of the base classifiers in the ensemble. The experimental results demonstrated that the proposed model outperformed the other models, including the single model, the instance selection model, and the original random subspace ensemble model.
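
A minimal Python sketch of the kind of hybrid described in this abstract, not the authors' implementation: a simple genetic algorithm evolves a binary mask over the training instances, and the fitness of each mask is the validation accuracy of a random subspace ensemble (approximated here with scikit-learn's BaggingClassifier, bootstrap=False and max_features<1.0). The dataset, population size, mutation rate, and generation count are illustrative assumptions.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=600, n_features=20, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

    def subspace_ensemble():
        # bootstrap=False + max_features=0.5: each tree sees a random feature subspace
        return BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                                 bootstrap=False, max_features=0.5, random_state=0)

    def fitness(mask):
        if mask.sum() < 10:                      # guard against degenerate chromosomes
            return 0.0
        model = subspace_ensemble().fit(X_tr[mask], y_tr[mask])
        return model.score(X_val, y_val)

    rng = np.random.default_rng(0)
    pop = rng.random((20, len(y_tr))) < 0.8      # population of binary instance masks
    for gen in range(15):
        scores = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(scores)[-10:]]  # truncation selection of the best masks
        children = []
        for _ in range(10):
            a, b = parents[rng.integers(10, size=2)]
            cut = rng.integers(1, len(y_tr))     # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(len(y_tr)) < 0.01  # bit-flip mutation
            children.append(np.where(flip, ~child, child))
        pop = np.vstack([parents, children])

    best = pop[np.argmax([fitness(m) for m in pop])]
    print("validation accuracy with selected instances:", fitness(best))

The design choice mirrored here is that the chromosome encodes only which training instances survive; the random subspace ensemble itself is left unchanged and simply retrained on the selected instances.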

Optimization of Random Subspace Ensemble for Bankruptcy Prediction (재무부실화 예측을 위한 랜덤 서브스페이스 앙상블 모형의 최적화)

  • Min, Sung-Hwan
    • Journal of Information Technology Services / v.14 no.4 / pp.121-135 / 2015
  • Ensemble classification utilizes multiple classifiers instead of a single classifier, and ensemble classifiers have recently attracted much attention in the data mining community. Ensemble learning techniques have proved very useful for improving prediction accuracy. Bagging, boosting and random subspace are the most popular ensemble methods. In the random subspace method, each base classifier is trained on a randomly chosen feature subspace of the original feature space, and the outputs of the different base classifiers are aggregated, usually by a simple majority vote. In this study, we applied the random subspace method to the bankruptcy prediction problem. Moreover, we proposed a method for optimizing the random subspace ensemble: a genetic algorithm was used to optimize the classifier subset of the random subspace ensemble for bankruptcy prediction. This paper applied the proposed genetic-algorithm-based random subspace ensemble model to the bankruptcy prediction problem using a real data set and compared it with other models. Experimental results showed that the proposed model outperformed the other models.
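
A minimal sketch of the plain random subspace step summarized in this abstract, before any genetic-algorithm selection of the classifier subset: each base classifier is trained on a randomly drawn feature subset and predictions are combined by simple majority vote. The dataset, subspace size, and decision-tree base learners are illustrative assumptions.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=30, random_state=1)
    rng = np.random.default_rng(1)

    members = []
    for _ in range(15):
        feats = rng.choice(X.shape[1], size=15, replace=False)    # random feature subspace
        clf = DecisionTreeClassifier(random_state=0).fit(X[:, feats], y)
        members.append((feats, clf))

    def predict(X_new):
        votes = np.stack([clf.predict(X_new[:, feats]) for feats, clf in members])
        return (votes.mean(axis=0) > 0.5).astype(int)             # majority vote (binary labels)

    print("training accuracy of the ensemble:", (predict(X) == y).mean())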

Tutorial: Dimension reduction in regression with a notion of sufficiency

  • Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.23 no.2 / pp.93-103 / 2016
  • In this paper, we discuss dimension reduction of predictors $\mathbf{X} \in \mathbb{R}^p$ in a regression of $Y \mid \mathbf{X}$ with a notion of sufficiency called sufficient dimension reduction. In sufficient dimension reduction, the original predictors $\mathbf{X}$ are replaced by their lower-dimensional linear projection without loss of information on selected aspects of the conditional distribution. Depending on the aspect, the central subspace, the central mean subspace and the central $k^{th}$-moment subspace are defined and investigated as the primary objects of interest. The relationships among the three subspaces and how they change under non-singular transformations of $\mathbf{X}$ are then studied. We discuss two conditions that guarantee the existence of the three subspaces; these conditions constrain the marginal distribution of $\mathbf{X}$ and the conditional distribution of $Y \mid \mathbf{X}$. A general approach to estimating the subspaces is also introduced, along with an explanation of conditions commonly assumed in most sufficient dimension reduction methodologies.
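
As one concrete instance of the estimation approach mentioned at the end of this abstract, the following is a minimal sketch of sliced inverse regression (SIR), a standard estimator of the central subspace. The data-generating model, sample size, and slice count are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, H = 1000, 6, 10                      # sample size, predictors, slices
    X = rng.standard_normal((n, p))
    y = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.1 * rng.standard_normal(n)   # true subspace: span{e1, e2}

    # standardize predictors: Z has identity covariance
    Z = (X - X.mean(0)) @ np.linalg.inv(np.linalg.cholesky(np.cov(X.T)).T)
    order = np.argsort(y)
    slices = np.array_split(order, H)
    # SIR kernel: weighted outer products of within-slice means of Z
    M = sum(len(s) / n * np.outer(Z[s].mean(0), Z[s].mean(0)) for s in slices)

    eigval, eigvec = np.linalg.eigh(M)
    directions = eigvec[:, ::-1][:, :2]        # leading eigenvectors span the estimated subspace (Z scale)
    print(np.round(directions, 2))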

Note on the estimation of informative predictor subspace and projective-resampling informative predictor subspace (다변량회귀에서 정보적 설명 변수 공간의 추정과 투영-재표본 정보적 설명 변수 공간 추정의 고찰)

  • Yoo, Jae Keun
    • The Korean Journal of Applied Statistics / v.35 no.5 / pp.657-666 / 2022
  • An informative predictor subspace is useful for estimating the central subspace when the conditions required by usual sufficient dimension reduction methods fail. Recently, for multivariate regression, Ko and Yoo (2022) defined a projective-resampling informative predictor subspace in place of the informative predictor subspace by adopting the projective-resampling method of Li et al. (2008). The new space is contained in the informative predictor subspace but contains the central subspace. In this paper, a method to directly estimate the informative predictor subspace is proposed and compared with the method of Ko and Yoo (2022) through theoretical considerations and numerical studies. The numerical studies confirm that the Ko-Yoo method estimates the central subspace better than the proposed method and is more efficient in the sense that it shows less variation in the estimation.
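
A minimal sketch of the projective-resampling idea referred to in this abstract: the multivariate response is repeatedly projected onto random unit directions, a univariate kernel (SIR here) is formed for each projection, and the kernels are averaged. This is an illustration of the general device, not the Ko-Yoo estimator itself; the model, number of projections, and slice count are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, q, H, B = 800, 5, 3, 8, 200          # sample size, dim(X), dim(Y), slices, projections
    X = rng.standard_normal((n, p))
    Y = np.column_stack([X[:, 0], X[:, 0] + X[:, 1] ** 2, rng.standard_normal(n)]) \
        + 0.1 * rng.standard_normal((n, q))

    Z = (X - X.mean(0)) @ np.linalg.inv(np.linalg.cholesky(np.cov(X.T)).T)

    def sir_kernel(z, y1d):
        # univariate SIR kernel for one projected response
        order = np.argsort(y1d)
        return sum(len(s) / len(y1d) * np.outer(z[s].mean(0), z[s].mean(0))
                   for s in np.array_split(order, H))

    M = np.zeros((p, p))
    for _ in range(B):
        v = rng.standard_normal(q)
        v /= np.linalg.norm(v)                 # random direction on the unit sphere
        M += sir_kernel(Z, Y @ v) / B          # average the projected kernels

    eigvec = np.linalg.eigh(M)[1][:, ::-1]
    print(np.round(eigvec[:, :2], 2))          # estimated basis (Z scale)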

Application of the Krylov Subspace Method to the Incompressible Navier-Stokes Equations (비압축성 Navier-Stokes 방정식에 대한 Krylov 부공간법의 적용)

  • Maeng, Joo-Sung;Choi, IL-Kon;Lim, Youn-Woo
    • Transactions of the Korean Society of Mechanical Engineers B / v.24 no.7 / pp.907-915 / 2000
  • The preconditioned Krylov subspace methods were applied to the incompressible Navier-Stokes equations for convergence acceleration. Three Krylov subspace methods, combined with five preconditioners, were tested on the lid-driven cavity flow problem. The MILU-preconditioned CG method showed very fast and stable convergence. Among the combinations tested, the GMRES/MILU-CG solver for the momentum and pressure-correction equations was found to be the least dependent on the number of grid points. A guideline for stopping the inner iterations of each equation is also offered.
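
A minimal SciPy sketch of a preconditioned Krylov solve of the kind described in this abstract; an incomplete-LU factorization (spilu) stands in for the MILU preconditioner, and a 2-D Poisson matrix mimics a pressure-correction system. Sizes and solver settings are illustrative assumptions.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import gmres, spilu, LinearOperator

    n = 50
    I = sp.identity(n)
    T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
    A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()        # 2-D Poisson operator (pressure-like system)
    b = np.ones(A.shape[0])

    ilu = spilu(A, drop_tol=1e-4, fill_factor=10)      # incomplete LU as a stand-in for MILU
    M = LinearOperator(A.shape, ilu.solve)             # preconditioner wrapped as a linear operator

    x, info = gmres(A, b, M=M, restart=30, maxiter=500)
    print("converged" if info == 0 else f"info={info}",
          "residual:", np.linalg.norm(b - A @ x))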

PARALLEL PERFORMANCE OF MULTISPLITTING METHODS WITH PREWEIGHTING

  • Han, Yu-Du;Yun, Jae-Heon
    • Journal of the Korean Mathematical Society / v.49 no.4 / pp.805-827 / 2012
  • In this paper, we first study the convergence of a special type of multisplitting method with preweighting, and then we provide some comparison results for those multisplitting methods. Next, we propose both a parallel implementation of an SOR-like multisplitting method with preweighting and an application of that method as a parallel preconditioner for a Krylov subspace method. Lastly, we provide parallel performance results for both the SOR-like multisplitting method with preweighting and the Krylov subspace method with the parallel preconditioner, to evaluate the parallel efficiency of the proposed methods.
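
A minimal sketch of a two-splitting multisplitting iteration with preweighting, of the general form x <- sum_i E_i M_i^{-1}(N_i x + b) where A = M_i - N_i and the diagonal weighting matrices E_i sum to the identity. Forward and backward SOR-like splittings are used here purely for illustration; the matrix, relaxation factor, and weights are assumptions, not the authors' parallel scheme.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 8
    A = np.diag(np.full(n, 4.0)) + np.diag(np.full(n - 1, -1.0), 1) + np.diag(np.full(n - 1, -1.0), -1)
    b = rng.standard_normal(n)

    omega = 1.2
    D = np.diag(np.diag(A))
    L, U = -np.tril(A, -1), -np.triu(A, 1)                    # A = D - L - U
    M1, N1 = D / omega - L, ((1 - omega) / omega) * D + U     # forward SOR-like splitting
    M2, N2 = D / omega - U, ((1 - omega) / omega) * D + L     # backward SOR-like splitting
    E1 = np.diag(np.linspace(1.0, 0.0, n))                    # preweighting matrices, E1 + E2 = I
    E2 = np.eye(n) - E1

    x = np.zeros(n)
    for _ in range(200):
        # weighted combination of the two splitting updates
        x = E1 @ np.linalg.solve(M1, N1 @ x + b) + E2 @ np.linalg.solve(M2, N2 @ x + b)
    print("residual norm:", np.linalg.norm(b - A @ x))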

Codebook design for subspace distribution clustering hidden Markov model (Subspace distribution clustering hidden Markov model을 위한 codebook design)

  • Cho, Young-Kyu;Yook, Dong-Suk
    • Proceedings of the KSPS conference / 2005.04a / pp.87-90 / 2005
  • Today's state-of-the-art speech recognition systems typically use continuous-distribution hidden Markov models with mixtures of Gaussian distributions. To obtain higher recognition accuracy, these hidden Markov models typically require a huge number of Gaussian distributions, so such systems need too much memory and run too slowly for large applications. Many approaches have been proposed for the design of compact acoustic models. One of them is the subspace distribution clustering hidden Markov model, which represents the original full-space distributions as combinations of a small number of subspace distribution codebooks. How to build the codebooks is therefore an important issue in this approach. In this paper, we report experimental results on various quantization methods for building more accurate models.
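
A minimal sketch of the codebook-design step described in this abstract: the parameters of many full-space Gaussians are split into low-dimensional streams, and each stream is quantized with k-means, so every original Gaussian is represented by a tuple of codeword indices. The synthetic "model" parameters, stream width, and codebook size are illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    n_gauss, dim, stream_width, codebook_size = 2000, 12, 2, 64
    means = rng.standard_normal((n_gauss, dim))            # stand-ins for HMM Gaussian means
    log_vars = rng.standard_normal((n_gauss, dim)) * 0.3   # stand-ins for log-variances

    streams = [slice(i, i + stream_width) for i in range(0, dim, stream_width)]
    codebooks, indices = [], []
    for s in streams:
        # each stream's (mean, log-variance) pair is quantized jointly
        params = np.hstack([means[:, s], log_vars[:, s]])
        km = KMeans(n_clusters=codebook_size, n_init=4, random_state=0).fit(params)
        codebooks.append(km.cluster_centers_)
        indices.append(km.labels_)

    indices = np.stack(indices, axis=1)    # one codeword index per Gaussian per stream
    print("index table shape:", indices.shape, "codebook shape:", codebooks[0].shape)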

Investigating SIR, DOC and SAVE for the Polychotomous Response

  • Lee, Hak-Bae;Lee, Hee-Min
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.501-506 / 2012
  • This paper investigates the central subspace related to SIR, DOC and SAVE when the response takes more than two values. The subspaces constructed by SIR, DOC and SAVE are investigated and compared; the SAVE paradigm is the most comprehensive. In addition, SAVE coincides with the central subspace when the conditional distribution of the predictors given the response is normally distributed.
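
A minimal sketch of the SAVE estimator for a polychotomous response, where each response category acts as a slice: with standardized predictors Z, the kernel is M = sum_k p_k (I - Cov(Z | Y = k))^2 and its leading eigenvectors span the estimated subspace. The three-class data-generating model is an illustrative assumption.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 1500, 5
    X = rng.standard_normal((n, p))
    y = np.digitize(X[:, 0] ** 2 + 0.3 * rng.standard_normal(n), [0.5, 2.0])   # classes 0, 1, 2

    Z = (X - X.mean(0)) @ np.linalg.inv(np.linalg.cholesky(np.cov(X.T)).T)     # standardize
    M = np.zeros((p, p))
    for k in np.unique(y):
        Zk = Z[y == k]
        S = np.eye(p) - np.cov(Zk.T)           # deviation of the within-class covariance from I
        M += (len(Zk) / n) * S @ S

    eigvec = np.linalg.eigh(M)[1][:, ::-1]
    print(np.round(eigvec[:, :1], 2))          # leading direction, ideally close to e1 (Z scale)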

Investigation of Efficiency of Starting Iteration Vectors for Calculating Natural Modes (고유모드 계산을 위한 초기 반복벡터의 효율성 연구)

  • Kim, Byoung-Wan;Kyoung, Jo-Hyun;Hong, Sa-Young;Cho, Seok-Kyu;Lee, In-Won
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.15 no.1 s.94 / pp.112-117 / 2005
  • Two modified versions of the subspace iteration method using accelerated starting vectors are proposed to efficiently calculate the free vibration modes of structures. The proposed methods employ accelerated Lanczos vectors as starting iteration vectors in order to accelerate the convergence of the subspace iteration method. They take two forms according to the number of starting vectors: the first uses 2p starting vectors when the number of required modes is p, and the second uses 1.5p starting vectors. Two numerical examples are presented to investigate the efficiency of the proposed methods.
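
A minimal sketch of subspace iteration started from a block of Lanczos vectors rather than random vectors, in the spirit of the accelerated starting vectors described in this abstract (the 2p block size is used). A standard symmetric eigenproblem stands in for the structural stiffness/mass pair, and the matrix and iteration counts are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 60, 4
    A = np.diag(np.full(n, 2.0)) + np.diag(np.full(n - 1, -1.0), 1) + np.diag(np.full(n - 1, -1.0), -1)

    def lanczos_block(A, m):
        # plain Lanczos recurrence on A^{-1} (shift-invert spirit); returns m vectors
        Q = np.zeros((A.shape[0], m))
        q = rng.standard_normal(A.shape[0])
        q /= np.linalg.norm(q)
        beta, q_prev = 0.0, np.zeros_like(q)
        for j in range(m):
            Q[:, j] = q
            w = np.linalg.solve(A, q) - beta * q_prev
            alpha = q @ w
            w -= alpha * q
            q_prev, beta = q, np.linalg.norm(w)
            q = w / beta
        return Q

    X = lanczos_block(A, 2 * p)                      # 2p accelerated starting vectors
    for _ in range(20):                              # subspace (inverse) iteration
        X = np.linalg.solve(A, X)
        X, _ = np.linalg.qr(X)
        H = X.T @ A @ X                              # Rayleigh-Ritz projection
        vals, vecs = np.linalg.eigh(H)
        X = X @ vecs

    print("smallest Ritz values:", np.round(vals[:p], 4))
    print("exact smallest eigenvalues:", np.round(np.linalg.eigvalsh(A)[:p], 4))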

Comparison of black and gray box models of subspace identification under support excitations

  • Datta, Diptojit;Dutta, Anjan
    • Structural Monitoring and Maintenance / v.4 no.4 / pp.365-379 / 2017
  • This paper presents a comparison of black-box and physics-based gray-box models for subspace identification of structures subjected to support excitation. The study compares the damage detection capabilities of both methods for linear time-invariant (LTI) systems as well as linear time-varying (LTV) systems, extending the gray-box model to time-varying systems using short-time windows. The numerically simulated IASC-ASCE Phase-I benchmark building has been used to compare the two methods for different damage scenarios. The efficacy of the two methods for identifying stiffness parameters has been studied in the presence of different levels of sensor noise to simulate on-field conditions. The proposed extension of the gray-box model to LTV systems is shown to outperform the black-box model in capturing the variation of stiffness parameters for the benchmark building.
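
A minimal sketch of a black-box, output-only subspace identification step of the kind compared in this abstract: a block Hankel matrix of the measured response is factorized by SVD to obtain an extended observability matrix, the state matrix follows from its shift structure, and natural frequencies come from its eigenvalues. The simulated two-mode single-output signal, Hankel block size, and model order are illustrative assumptions, not the benchmark data used by the authors.

    import numpy as np

    dt, n_steps = 0.01, 2000
    t = np.arange(n_steps) * dt
    # synthetic "measured" response with two lightly damped modes plus noise
    y = (np.exp(-0.3 * t) * np.sin(2 * np.pi * 2.0 * t)
         + 0.5 * np.exp(-0.5 * t) * np.sin(2 * np.pi * 5.0 * t)
         + 0.02 * np.random.default_rng(0).standard_normal(n_steps))

    rows, order = 40, 4
    H = np.array([y[i:i + n_steps - rows] for i in range(rows)])   # block Hankel matrix of outputs
    U, s, _ = np.linalg.svd(H, full_matrices=False)
    Ob = U[:, :order] * np.sqrt(s[:order])                         # extended observability matrix
    A = np.linalg.pinv(Ob[:-1]) @ Ob[1:]                           # shift invariance: Ob_up A = Ob_down
    eig = np.linalg.eigvals(A)
    freqs = np.abs(np.log(eig)) / (2 * np.pi * dt)                 # discrete poles -> frequencies in Hz
    print("identified frequencies (Hz):", np.round(np.sort(np.unique(np.round(freqs, 2))), 2))

Identifying stiffness parameters, as in the paper, requires mapping the identified state matrix back to physical mass/stiffness quantities (the gray-box step), which is beyond this output-only sketch.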