Title/Summary/Keyword: dimension reduction subspace

16 results found.

Tutorial: Dimension reduction in regression with a notion of sufficiency

  • Yoo, Jae Keun
    • Communications for Statistical Applications and Methods, v.23 no.2, pp.93-103, 2016
  • In the paper, we discuss dimension reduction of predictors $\mathbf{X} \in \mathbb{R}^p$ in a regression of $Y \mid \mathbf{X}$ with a notion of sufficiency, called sufficient dimension reduction. In sufficient dimension reduction, the original predictors $\mathbf{X}$ are replaced by a lower-dimensional linear projection without loss of information on selected aspects of the conditional distribution. Depending on the aspect, the central subspace, the central mean subspace and the central $k^{th}$-moment subspace are defined and investigated as the primary interests. The relationships among the three subspaces, and how they change under non-singular transformations of $\mathbf{X}$, are then studied. We discuss two conditions that guarantee the existence of the three subspaces, which constrain the marginal distribution of $\mathbf{X}$ and the conditional distribution of $Y \mid \mathbf{X}$. A general approach to estimating the subspaces is also introduced, along with an explanation of the conditions commonly assumed in most sufficient dimension reduction methodologies.
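
  The defining conditions can be stated compactly. As a minimal sketch in the standard notation of the field (the formulas below are standard definitions, not quotations from the paper): a subspace $\mathcal{S} = \mathrm{span}(B)$ with $B \in \mathbb{R}^{p \times d}$ is a dimension reduction subspace if $Y \perp\!\!\!\perp \mathbf{X} \mid B^{\top}\mathbf{X}$, and the central subspace $\mathcal{S}_{Y \mid \mathbf{X}}$ is the intersection of all such subspaces, provided that the intersection is itself a dimension reduction subspace. The central mean subspace replaces the independence condition with $E(Y \mid \mathbf{X}) = E(Y \mid B^{\top}\mathbf{X})$, and the central $k^{th}$-moment subspace imposes the analogous condition on the first $k$ conditional moments.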

Tutorial: Methodologies for sufficient dimension reduction in regression

  • Yoo, Jae Keun
    • Communications for Statistical Applications and Methods, v.23 no.2, pp.105-117, 2016
  • In the paper, as a sequel to the first tutorial, we discuss sufficient dimension reduction methodologies used to estimate the central subspace (sliced inverse regression, sliced average variance estimation), the central mean subspace (ordinary least squares, principal Hessian directions, iterative Hessian transformation), and the central $k^{th}$-moment subspace (covariance method). Large-sample tests to determine the structural dimensions of the three target subspaces are derived for most of the methodologies; in addition, a permutation test, which does not require large-sample distributions, is introduced and can be applied to all the methodologies discussed in the paper. Theoretical relationships among the sufficient dimension reduction methodologies are also investigated, and a real data analysis is presented for illustration. A seeded dimension reduction approach is then introduced so that the methodologies can be applied to large p, small n regressions.
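
  Of the methods above, sliced inverse regression is the simplest to sketch in code. The following is a minimal illustration of the slicing idea (my own sketch with my own function names, not code from the tutorial): standardize the predictors, average them within slices of the ordered response, and take the leading eigenvectors of the weighted covariance of the slice means.

    import numpy as np

    def sir_directions(X, y, n_slices=5, d=1):
        """Minimal sliced inverse regression sketch."""
        n, p = X.shape
        # Standardize the predictors: Z = (X - mean) @ Sigma^{-1/2}
        Xc = X - X.mean(axis=0)
        evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
        inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
        Z = Xc @ inv_sqrt
        # Partition the observations into slices by the ordered response
        order = np.argsort(y)
        M = np.zeros((p, p))
        for idx in np.array_split(order, n_slices):
            m = Z[idx].mean(axis=0)
            M += (len(idx) / n) * np.outer(m, m)  # weighted slice-mean covariance
        # Leading eigenvectors of M, mapped back to the original X scale
        w, v = np.linalg.eigh(M)
        return inv_sqrt @ v[:, ::-1][:, :d], w[::-1]

  Applied to data generated as $Y = f(B^{\top}\mathbf{X}) + \varepsilon$, the returned basis should approximately span $\mathrm{span}(B)$, and the trailing eigenvalues are the raw material for the dimension tests mentioned above.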

On hierarchical clustering in sufficient dimension reduction

  • Yoo, Chaeyeon; Yoo, Younju; Um, Hye Yeon; Yoo, Jae Keun
    • Communications for Statistical Applications and Methods, v.27 no.4, pp.431-443, 2020
  • The K-means clustering algorithm has been successfully applied in sufficient dimension reduction. Unfortunately, the algorithm lacks reproducibility and nestedness, which will be discussed in this paper. These are clear deficits of the K-means clustering algorithm; the hierarchical clustering algorithm, by contrast, has both reproducibility and nestedness, but an intensive comparison between the K-means and hierarchical clustering algorithms has not yet been done in a sufficient dimension reduction context. In this paper, we rigorously study the two clustering algorithms for two popular sufficient dimension reduction methodologies, the inverse mean and clustering mean methods, through intensive numerical studies. Simulation studies and two real data examples confirm that the hierarchical clustering algorithm has a potential advantage over the K-means algorithm.
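
  The reproducibility and nestedness properties are easy to see in code. Below is a small self-contained illustration of the contrast (my own sketch, not from the paper; the data are a synthetic stand-in for a response to be clustered): K-means partitions can change with the random initialization, whereas cutting a single hierarchical tree at $k$ and $k+1$ clusters gives deterministic, nested partitions.

    import numpy as np
    from sklearn.cluster import KMeans, AgglomerativeClustering

    rng = np.random.default_rng(1)
    Y = rng.normal(size=(200, 3))  # synthetic multivariate response

    # K-means: different random initializations may give different partitions,
    # so reproducibility is not guaranteed.
    labels_a = KMeans(n_clusters=4, n_init=1, random_state=0).fit_predict(Y)
    labels_b = KMeans(n_clusters=4, n_init=1, random_state=1).fit_predict(Y)

    # Hierarchical clustering: deterministic, and cuts of one tree are nested.
    h4 = AgglomerativeClustering(n_clusters=4).fit_predict(Y)
    h5 = AgglomerativeClustering(n_clusters=5).fit_predict(Y)
    for g in np.unique(h5):
        # every 5-cluster group lies inside exactly one 4-cluster group
        assert len(np.unique(h4[h5 == g])) == 1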

A Note on Bootstrapping in Sufficient Dimension Reduction

  • Yoo, Jae Keun; Jeong, Sun
    • Communications for Statistical Applications and Methods, v.22 no.3, pp.285-294, 2015
  • A permutation test is a popular and attractive alternative to deriving asymptotic distributions of dimension test statistics in sufficient dimension reduction methodologies; however, recent studies show that a bootstrapping technique can also be used. We consider two types of bootstrapping dimension determination: the partial and the whole bootstrapping procedures. Numerical studies compare the permutation test and the two bootstrapping procedures, and a real data application is presented. With the two bootstrapping procedures considered in addition to the existing permutation test, one has more supporting evidence for the dimension of the central subspace, allowing it to be determined more convincingly.
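
  To make the idea concrete, here is a rough sketch of bootstrap dimension determination in the spirit of variability-based approaches (my own illustration; the paper's partial and whole bootstrapping procedures differ in what is resampled, and this sketch resamples whole observation pairs). For each candidate dimension $d$, the variability of bootstrapped $d$-dimensional subspace estimates around the full-sample estimate is measured; a dimension whose estimates are stable is favored.

    import numpy as np

    def subspace_distance(A, B):
        """1 minus the trace correlation between the column spaces of A and B."""
        Pa = A @ np.linalg.pinv(A)  # orthogonal projection onto span(A)
        Pb = B @ np.linalg.pinv(B)
        return 1.0 - np.trace(Pa @ Pb) / A.shape[1]

    def bootstrap_variability(X, y, estimator, d, n_boot=100, seed=0):
        """Mean distance between bootstrap and full-sample d-dim estimates."""
        rng = np.random.default_rng(seed)
        B_full = estimator(X, y, d)  # e.g., a wrapper around the SIR sketch above
        n = len(y)
        dists = []
        for _ in range(n_boot):
            idx = rng.integers(0, n, size=n)  # resample (X, y) pairs with replacement
            dists.append(subspace_distance(B_full, estimator(X[idx], y[idx], d)))
        return np.mean(dists)  # small, stable values support dimension d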

Note on the estimation of informative predictor subspace and projective-resampling informative predictor subspace

  • Yoo, Jae Keun
    • The Korean Journal of Applied Statistics, v.35 no.5, pp.657-666, 2022
  • An informative predictor subspace is useful for estimating the central subspace when the conditions required in the usual sufficient dimension reduction methods fail. Recently, for multivariate regression, Ko and Yoo (2022) newly defined a projective-resampling informative predictor subspace, instead of the informative predictor subspace, by adopting the projective-resampling method (Li et al., 2008). The new space is contained in the informative predictor subspace but contains the central subspace. In this paper, a method to estimate the informative predictor subspace directly is proposed and compared with the method of Ko and Yoo (2022) through theoretical considerations and numerical studies. The numerical studies confirm that the Ko-Yoo method estimates the central subspace better than the proposed method and is more efficient in the sense that it shows less variation in the estimation.

Classification Using Sliced Inverse Regression and Sliced Average Variance Estimation

  • Lee, Hakbae
    • Communications for Statistical Applications and Methods, v.11 no.2, pp.275-285, 2004
  • We explore classification analysis using graphical methods such as sliced inverse regression and sliced average variance estimation, based on dimension reduction. Useful information about classification is obtained from sliced inverse regression and sliced average variance estimation through dimension reduction. Two examples are illustrated, and classification rates from sliced inverse regression and sliced average variance estimation are compared with those from discriminant analysis and logistic regression.
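
  A minimal end-to-end illustration of classification after dimension reduction follows (my own sketch; the paper's two examples are not reproduced here, and the iris data are used purely as a stand-in). For a categorical response, each class is its own slice, so the SIR kernel is the between-class covariance of the standardized predictors; classifiers are then compared on the original and the reduced predictors.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def sir_class_directions(X, y, d=2):
        """SIR for a categorical response: classes play the role of slices."""
        Xc = X - X.mean(axis=0)
        w, V = np.linalg.eigh(np.cov(X, rowvar=False))
        inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
        Z = Xc @ inv_sqrt
        M = sum(np.mean(y == c) * np.outer(Z[y == c].mean(0), Z[y == c].mean(0))
                for c in np.unique(y))
        _, U = np.linalg.eigh(M)
        return inv_sqrt @ U[:, ::-1][:, :d]

    X, y = load_iris(return_X_y=True)
    B = sir_class_directions(X, y, d=2)
    for clf in (LogisticRegression(max_iter=1000), LinearDiscriminantAnalysis()):
        full = cross_val_score(clf, X, y, cv=5).mean()
        reduced = cross_val_score(clf, X @ B, y, cv=5).mean()
        print(type(clf).__name__, round(full, 3), round(reduced, 3))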

Fused inverse regression with multi-dimensional responses

  • Cho, Youyoung; Han, Hyoseon; Yoo, Jae Keun
    • Communications for Statistical Applications and Methods, v.28 no.3, pp.267-279, 2021
  • A regression with multi-dimensional responses is quite common nowadays in the so-called big data era. In such regressions, to relieve the curse of dimensionality due to the high dimension of the responses, dimension reduction of the predictors is essential in the analysis. Sufficient dimension reduction provides effective tools for this reduction, but there are few sufficient dimension reduction methodologies for multivariate regression. To fill this gap, we propose two new fused slice-based inverse regression methods. The proposed approaches are robust to the numbers of clusters or slices and improve estimation over existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods. A real data analysis confirms the practical usefulness of the proposed methods.
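
  The fusing idea itself is simple to sketch: rather than committing to one number of slices, SIR-type kernel matrices computed over a range of slice numbers are accumulated before the spectral step. A minimal illustration (my own sketch; the paper's two proposed methods are not reproduced here), assuming the predictors have already been standardized:

    import numpy as np

    def fused_sir_kernel(Z, y, slice_numbers=(2, 3, 4, 5, 6)):
        """Sum SIR kernel matrices over several slice numbers (fusing idea)."""
        n, p = Z.shape
        order = np.argsort(y)
        M_fused = np.zeros((p, p))
        for H in slice_numbers:
            for idx in np.array_split(order, H):
                m = Z[idx].mean(axis=0)
                M_fused += (len(idx) / n) * np.outer(m, m)
        return M_fused  # its leading eigenvectors estimate the central subspace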

Iterative projection of sliced inverse regression with fused approach

  • Han, Hyoseon; Cho, Youyoung; Yoo, Jae Keun
    • Communications for Statistical Applications and Methods, v.28 no.2, pp.205-215, 2021
  • Sufficient dimension reduction is a useful dimension reduction tool in regression, and sliced inverse regression (Li, 1991) is one of the most popular sufficient dimension reduction methodologies. In spite of its popularity, it is known to be sensitive to the number of slices. To overcome this shortcoming, the so-called fused sliced inverse regression was proposed by Cook and Zhang (2014). Unfortunately, neither of the two existing methods applies directly to large p, small n regression, in which dimension reduction is desperately needed. In this paper, we newly propose seeded sliced inverse regression and seeded fused sliced inverse regression to overcome this deficit by adopting the iterative projection approach (Cook et al., 2007). Numerical studies are presented to study their asymptotic estimation behaviors, and a real data analysis confirms their practical usefulness in high-dimensional data analysis.
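
  The appeal of seeded reductions in large p, small n regression is that the p x p covariance matrix is never inverted. A rough sketch of the iterative projection idea of Cook et al. (2007) follows (my own simplification with my own names): starting from a seed matrix $\nu$ (for sliced inverse regression, the matrix of slice-mean deviations), a candidate basis is grown by repeated multiplication by the sample covariance.

    import numpy as np

    def seeded_basis(Sigma, nu, u):
        """Candidate basis R_u = [nu, Sigma nu, ..., Sigma^{u-1} nu].

        Sigma: p x p sample covariance (used only in products, never inverted);
        nu:    p x q seed matrix, e.g. slice-mean deviations for SIR;
        u:     number of iterative projection steps.
        """
        blocks, cur = [], nu
        for _ in range(u):
            blocks.append(cur)
            cur = Sigma @ cur
        R = np.hstack(blocks)       # p x (u * q)
        Q, _ = np.linalg.qr(R)      # orthonormalize for numerical stability
        return Q                    # spans the candidate reduction subspace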

Investigating SIR, DOC and SAVE for the Polychotomous Response

  • Lee, Hak-Bae; Lee, Hee-Min
    • Communications for Statistical Applications and Methods, v.19 no.3, pp.501-506, 2012
  • This paper investigates the central subspace related to SIR, DOC and SAVE when the response takes more than two values. The subspaces constructed by SIR, DOC and SAVE are investigated and compared, and the SAVE paradigm is found to be the most comprehensive. In addition, SAVE coincides with the central subspace when the conditional distribution of the predictors given the response is normally distributed.

On robustness in dimension determination in fused sliced inverse regression

  • Yoo, Jae Keun; Cho, Yoo Na
    • Communications for Statistical Applications and Methods, v.25 no.5, pp.513-521, 2018
  • The goal of sufficient dimension reduction (SDR) is to replace the original p-dimensional predictors with a lower-dimensional linearly transformed predictor. Sliced inverse regression (SIR) (Li, Journal of the American Statistical Association, 86, 316-342, 1991) is one of the most popular SDR methods because of its applicability and simple implementation in practice. However, SIR may yield different dimension reduction results for different numbers of slices; despite its popularity, this is a clear deficit of SIR. To overcome this, a fused sliced inverse regression was recently proposed. That study shows that the dimension-reduced predictors are robust to the number of slices, but it does not investigate how robust the dimension determination is. This paper suggests a permutation dimension determination for the fused sliced inverse regression, which is compared with SIR to investigate robustness to the number of slices in dimension determination. Numerical studies confirm the robustness, and a real data example is presented.
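
  A generic permutation dimension test can be sketched as follows (my own simplified illustration, not the paper's exact procedure): to test $H_0: d = m$, sum the $p - m$ smallest eigenvalues of the kernel matrix, and approximate the null distribution by recomputing the statistic after permuting, across observations, the predictor coordinates orthogonal to the first $m$ estimated directions.

    import numpy as np

    def perm_dimension_pvalue(Z, y, kernel_fn, m, n_perm=200, seed=0):
        """Permutation p-value for H0: structural dimension = m (sketch).

        Z: n x p standardized predictors; kernel_fn(Z, y) returns the
        p x p SIR-type kernel matrix. Under H0 the directions beyond the
        first m carry no signal, so permuting the orthogonal-complement
        coordinates should leave the trailing eigenvalues comparably small.
        """
        rng = np.random.default_rng(seed)
        p = Z.shape[1]
        vals, vecs = np.linalg.eigh(kernel_fn(Z, y))  # ascending eigenvalues
        stat = vals[: p - m].sum()                    # sum of the p - m smallest
        V = vecs[:, ::-1]                             # descending order
        keep, rest = V[:, :m], V[:, m:]
        exceed = 0
        for _ in range(n_perm):
            perm = rng.permutation(len(y))
            Zp = Z @ keep @ keep.T + (Z @ rest)[perm] @ rest.T
            vals_p = np.linalg.eigvalsh(kernel_fn(Zp, y))
            exceed += vals_p[: p - m].sum() >= stat
        return exceed / n_perm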