• Title/Abstract/Keywords: principal components analysis

Search results: 760 items (processing time: 0.035 seconds)

AN EFFICIENT ALGORITHM FOR SLIDING WINDOW BASED INCREMENTAL PRINCIPAL COMPONENTS ANALYSIS

  • Lee, Geunseop
    • Journal of the Korean Mathematical Society (대한수학회지)
    • /
    • Vol. 57, No. 2
    • /
    • pp.401-414
    • /
    • 2020
  • Computing principal components from scratch at every update or downdate is computationally expensive when new data arrive and existing data are truncated from the data matrix frequently. To overcome this limitation, incremental principal component analysis is considered. Specifically, we present an efficient sliding-window-based incremental computation of principal components from a covariance matrix. It comprises two procedures: a simultaneous update and downdate of the principal components, followed by a rank-one matrix update. Additionally, we track the accurate decomposition error and the adaptive numerical rank. Experiments show that the proposed algorithm runs faster than typical incremental principal component analysis algorithms with no meaningful difference in decomposition error, thereby maintaining a good approximation of the principal components.
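The sliding-window idea can be illustrated with a minimal sketch. This is not the authors' algorithm, which updates the principal components directly; it only shows how a window covariance can be maintained with one rank-one update and one rank-one downdate per step, after which the eigen-decomposition is recomputed naively. The class name `SlidingWindowCovariance` and its methods are hypothetical, and NumPy is assumed to be available.

```python
from collections import deque

import numpy as np


class SlidingWindowCovariance:
    """Hypothetical helper that tracks the covariance of a fixed-size window."""

    def __init__(self, initial_rows):
        rows = np.asarray(initial_rows, dtype=float)
        self.window = deque(rows.copy())   # samples currently in the window
        self.total = rows.sum(axis=0)      # running sum of the samples
        self.scatter = rows.T @ rows       # running sum of outer products x x^T

    def slide(self, x_new):
        """Append one new sample and drop the oldest one."""
        x_new = np.asarray(x_new, dtype=float)
        x_old = self.window.popleft()
        self.window.append(x_new)
        self.total += x_new - x_old
        self.scatter += np.outer(x_new, x_new)  # rank-one update
        self.scatter -= np.outer(x_old, x_old)  # rank-one downdate

    def covariance(self):
        """Unbiased sample covariance of the samples in the window."""
        n = len(self.window)
        mean = self.total / n
        return (self.scatter - n * np.outer(mean, mean)) / (n - 1)

    def principal_components(self, k):
        """Leading k eigenpairs of the window covariance, largest first."""
        eigvals, eigvecs = np.linalg.eigh(self.covariance())
        order = np.argsort(eigvals)[::-1][:k]
        return eigvals[order], eigvecs[:, order]
```

Each call to `slide` costs O(d²) for d variables, whereas rebuilding the covariance from the whole window costs O(nd²). The eigen-decomposition in `principal_components` is still recomputed in full, which is the cost an incremental method such as the one in the abstract aims to avoid.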

A New Deletion Criterion of Principal Components Regression with Orientations of the Parameters

  • Lee, Won-Woo
    • Journal of the Korean Statistical Society
    • /
    • Vol. 16, No. 2
    • /
    • pp.55-70
    • /
    • 1987
  • Principal components regression is one of the substitutes for the least squares method when multicollinearity exists in the multiple linear regression model. It is observed graphically that the performance of principal components regression depends strongly on the values of the parameters. Accordingly, a new deletion criterion, which determines the principal components to be deleted from the analysis, is developed, and its usefulness is checked by simulations.
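For readers unfamiliar with principal components regression, the following is a minimal sketch of the basic procedure, assuming NumPy. It does not implement the deletion criterion proposed in the abstract: the number of retained components `k` is simply chosen by the caller, and the function name `pcr_fit` is hypothetical.

```python
import numpy as np


def pcr_fit(X, y, k):
    """Regress y on the first k principal components of X (illustrative only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean

    # Right singular vectors of the centered data are the principal directions,
    # ordered from largest to smallest variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                      # keep k directions, delete the rest

    scores = Xc @ V_k                   # component scores (n x k)
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)

    beta = V_k @ gamma                  # coefficients on the original variables
    intercept = y_mean - x_mean @ beta
    return beta, intercept
```

When `k` equals the number of variables and the data have full column rank, the result coincides with ordinary least squares. Deleting components trades a lower variance for some bias, which is why the choice of components matters.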

Genetic Diversity of Soybean Pod Shape Based on Elliptic Fourier Descriptors

  • Truong Ngon T.;Gwag Jae-Gyun;Park Yong-Jin;Lee Suk-Ha
    • Korean Journal of Crop Science (한국작물학회지)
    • /
    • Vol. 50, No. 1
    • /
    • pp.60-66
    • /
    • 2005
  • Pod shape of twenty soybean (Glycine max L. Merrill) genotypes was evaluated quantitatively by image analysis using elliptic Fourier descriptors and their principal components. The closed contour of each pod projection was extracted, and 80 elliptic Fourier coefficients were calculated for each contour. The Fourier coefficients were standardized so that they were invariant to size, rotation, shift, and chain code starting point. Then, the principal components of the standardized Fourier coefficients were evaluated. The cumulative contribution at the fifth principal component was higher than $95\%$, indicating that the first, second, third, fourth, and fifth principal components represented the aspect ratio of the pod, the location of the pod centroid, the sharpness of the two pod tips, and the roundness of the base in the pod contour, respectively. Analysis of variance revealed significant genotypic differences in these principal components and in seed number per pod. As the principal components for pod shape varied continuously, pod shape might be controlled by polygenes. It was concluded that principal component scores based on elliptic Fourier descriptors seemed to be useful quantitative parameters, not only for evaluating soybean pod shape in a soybean breeding program but also for describing pod shape when evaluating soybean germplasm.

Application of varimax rotated principal component analysis in quantifying some zoometrical traits of a relict cow

  • Pares-Casanova, P.M.;Sinfreu, I.;Villalba, D.
    • Korean Journal of Veterinary Research (대한수의학회지)
    • /
    • Vol. 53, No. 1
    • /
    • pp.7-10
    • /
    • 2013
  • A study was conducted to determine the interdependence among the conformation traits of 28 "Pallaresa" cows using principal component analysis. Originally, 21 linear body measurements were obtained, from which eight traits were subsequently eliminated. From the principal components analysis, with raw varimax rotation of the transformation matrix, two principal components were extracted, which accounted for 65.8% of the total variance. The first principal component alone explained 51.6% of the variation and tended to describe general size, while the second principal component loaded on back-sternal diameter. The two extracted principal components, which are related to dorsal heights and back-sternal diameter, could be considered in selection programs.

주성분분석에 의한 재래종 옥수수의 해석 (Assessment and Classification of Korean Indigenous Corn Lines by Application of Principal Component Analysis)

  • 이인섭;박종옥
    • Journal of Life Science (생명과학회지)
    • /
    • Vol. 13, No. 3
    • /
    • pp.343-348
    • /
    • 2003
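  • To obtain breeding material, 49 lines of indigenous corn collected in the Busan and Gyeongnam regions were selected for this experiment. Principal component analysis was used to characterize the indigenous corn and to classify the lines, with the following results. In the principal component analysis based on 7 traits, the first four principal components explained 86.3% of the total variation, and the first two explained 67.4%. The contribution of the traits to the principal components differed by trait; it was large for the upper principal components and small for the lower ones. The correlation coefficients between the principal components and the traits clarified the biological meaning of each component and the plant type corresponding to it: the first principal component was related to plant size and growth duration, and the second to the number of ears and the number of tillers. For the third and fourth principal components, no significant relationship with the traits was found.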

피복 구성을 위한 경부 형태의 관찰 (Observation on the shape of the neck -by principal component analysis of the measurements-)

  • 이연순
    • Journal of the Ergonomics Society of Korea (대한인간공학회지)
    • /
    • Vol. 10, No. 2
    • /
    • pp.31-42
    • /
    • 1991
  • To understand the shape of the neck from the viewpoint of garment planning, principal component analysis was applied to measurements of the neck. The neck surface development and the cross sections of the neck were also observed. The materials consist of the body measurements, the neck surface developments, and the neck cross sections of a total of 108 Korean female students. No difference between the right and left sides of the neck was recognized, but differences among the heights of the front, side, and back neck points were recognized. 2. The initial 41 items were found to contain variety and duplication, so two criteria were established to resolve these problems, and 34 items were selected under each criterion. 3. The 43 and 34 items were compared by their cumulative contribution ratios and by the clarity of meaning of the principal components. As a result, the 34 measurement items were used for further analysis. 4. Principal component analysis of the 34 items yielded four principal components, which were interpreted as 1) the thickness of the neck, 2) the front neck-line on the waist basic pattern, 3) the shape of the neck surface development, and 4) the back neck-line on the waist basic pattern. 5. Graphic information on these principal components made the meaning of the four components clear visually. As a result, there is a large individual difference in the shape of the neck.

A Taxonomy of Korean Isopyroideae (Ranunculaceae)

  • Lee, Nam-Sook;Yeau, Sung-Hee
    • Animal cells and systems
    • /
    • Vol. 2, No. 4
    • /
    • pp.439-449
    • /
    • 1998
  • To discuss the taxonomic dispositions of Korean Isopyroideae (Ranunculaceae) taxa, principal components analysis and cluster analysis were performed using quantitative and qualitative morphological characters. The principal components analysis revealed that the size and number of ovules, ovary width, ratio of style length to ovary length, filament length, sepal size, style length, leaf size, and ovary length are important characters for distinguishing Korean Isopyroideae taxa. The cluster and principal components analyses based on both quantitative and qualitative characters demonstrate that Isopyrum mandshuricum is more closely related to Enemion raddeanum than to Semiaquilegia adoxoides. Even though Enemion is not separated from Isopyrum by quantitative characters, they are distinguished by qualitative characters, suggesting that four taxa, Enemion, Semiaquilegia, Isopyrum and Aquilegia, should be recognized in Korean Isopyroideae. In addition, cluster analyses suggest that S. adoxoides could be separated from Aquilegia buergeriana var. oxysepala.

라소를 이용한 간편한 주성분분석 (Simple principal component analysis using Lasso)

  • 박철용
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 24, No. 3
    • /
    • pp.533-541
    • /
    • 2013
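  • This study proposes a simple principal component analysis using the Lasso. The method consists of two steps. First, principal components are obtained by principal component analysis. Next, regression coefficient estimates are obtained from a Lasso regression model in which each principal component is the response variable and the original data are the explanatory variables. New principal components based on these regression coefficient estimates are then used. Because the Lasso makes it easier for regression coefficient estimates to become exactly zero, the method has the advantage of being easy to interpret. This is because the regression coefficients of the model with a principal component as the response variable and the original data as the explanatory variables are the eigenvector. Using an R package for Lasso regression, the method was applied to simulated and real data to show its usefulness.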
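The two-step procedure can be sketched as follows, assuming NumPy. The study used an R package for the Lasso; here a small coordinate-descent solver stands in for it. The function names (`lasso_cd`, `lasso_simple_pca`), the penalty scaling, and the fixed iteration count are all illustrative assumptions rather than the author's settings.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, alpha, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + alpha*||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y.copy()                       # equals y - X @ b throughout
    for _ in range(n_iter):
        for j in range(p):
            residual += X[:, j] * b[j]        # remove variable j's contribution
            rho = X[:, j] @ residual / n
            b[j] = soft_threshold(rho, alpha) / col_sq[j]
            residual -= X[:, j] * b[j]        # add the updated contribution back
    return b


def lasso_simple_pca(X, k, alpha):
    """Step 1: ordinary PCA. Step 2: Lasso-regress each component on the data."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1][:k]
    V = eigvecs[:, order]                     # ordinary loadings (eigenvectors)
    scores = Xc @ V                           # ordinary principal components
    B = np.column_stack([lasso_cd(Xc, scores[:, j], alpha) for j in range(k)])
    return V, B, Xc @ B                       # new components use sparse loadings B
```

With `alpha=0` the regression coefficients reproduce the eigenvectors, which is the property the abstract relies on. Increasing `alpha` pushes small loadings to exactly zero, so each new component involves fewer original variables.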

주성분 분석을 위한 새로운 EM 알고리듬 (New EM algorithm for Principal Component Analysis)

  • 안종훈;오종훈
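    • Korean Information Science Society: Conference Proceedings (한국정보과학회:학술대회논문집)
    • /
    • Proceedings of the Korean Information Science Society 2001 Spring Conference, Vol. 28, No. 1 (B)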
    • /
    • pp.529-531
    • /
    • 2001
  • We present an expectation-maximization algorithm for principal component analysis via orthogonalization. The algorithm finds the actual principal components, whereas previously proposed EM algorithms can only find the principal subspace. The new algorithm is simple and more efficient than probabilistic PCA, especially in noiseless cases. Conventional PCA requires computing the inverse of the covariance matrix, which makes the algorithm prohibitively expensive when the dimension of the data space is large. This EM algorithm is very powerful for high-dimensional data when only a few principal components are needed.
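The distinction between finding the principal subspace and finding the actual principal components can be illustrated with a minimal sketch, assuming NumPy. The abstract does not spell out the authors' orthogonalization scheme, so this is not their algorithm: the sketch runs the classical noiseless EM iteration, which converges to a basis of the principal subspace, and only afterwards orthogonalizes and rotates that basis. The function name `em_pca` and its defaults are hypothetical.

```python
import numpy as np


def em_pca(X, k, n_iter=100, seed=0):
    """Classical EM iteration for the principal subspace, then a final rotation."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], k))            # random initial basis (d x k)

    for _ in range(n_iter):
        # E-step: least-squares coordinates of every sample in the current basis.
        Z = np.linalg.solve(W.T @ W, W.T @ Xc.T).T  # n x k
        # M-step: basis that best reconstructs the data from those coordinates.
        W = np.linalg.solve(Z.T @ Z, Z.T @ Xc).T    # d x k

    # W now spans the principal subspace but its columns are an arbitrary basis.
    Q, _ = np.linalg.qr(W)                          # orthonormal basis of the subspace
    Y = Xc @ Q
    small_cov = Y.T @ Y / (len(Xc) - 1)             # k x k covariance in the subspace
    eigvals, U = np.linalg.eigh(small_cov)
    order = np.argsort(eigvals)[::-1]
    return Q @ U[:, order], eigvals[order]          # principal axes and variances
```

Each iteration works only with d x k and n x k matrices, so the full d x d covariance matrix is never formed. That is the source of the savings when the dimension is large and only a few components are needed.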

주성분회귀분석에서 주성분선정을 위한 새로운 방법 (Procedure for the Selection of Principal Components in Principal Components Regression)

  • 김부용;신명희
    • The Korean Journal of Applied Statistics (응용통계연구)
    • /
    • Vol. 23, No. 5
    • /
    • pp.967-975
    • /
    • 2010
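  • Regression models in data mining often include highly correlated explanatory variables, which induces multicollinearity; principal components regression can be applied to address the problems that multicollinearity causes. The key step in this analysis is selecting appropriate principal components, but existing selection methods have been criticized for failing to resolve multicollinearity adequately or for degrading the fit of the model. This paper therefore proposes a new selection method that addresses the multicollinearity problem and the loss of fit at the same time. Principal components regression can resolve the inflation of the variance of the least squares estimator caused by multicollinearity, but the bias that arises from selecting only some of the principal components must also be controlled. Accordingly, the condition index that minimizes the mean squared error of the principal components regression estimator is measured, and the main factors affecting this value are identified by conjoint analysis to build a model for the selection criterion. Upper and lower limits of the selection criterion are set; a principal component is excluded if its condition index exceeds the upper limit and included if it falls below the lower limit. For principal components whose condition indices lie between the two limits, the general linear test is applied sequentially to select the components.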
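The screening rule at the end of the abstract can be sketched as follows, assuming NumPy. The condition index is taken here in its usual form, the square root of the largest eigenvalue of the correlation matrix divided by each eigenvalue. The thresholds `lower` and `upper` are placeholders supplied by the caller, not the limits derived in the paper, and the sequential general linear test for the in-between components is not implemented. The function names are hypothetical.

```python
import numpy as np


def condition_indices(X):
    """Condition index of each principal component, largest eigenvalue first."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]    # descending order
    eigvals = np.clip(eigvals, 1e-12, None)     # guard against exact collinearity
    return np.sqrt(eigvals[0] / eigvals)


def screen_components(indices, lower, upper):
    """Label each component by comparing its condition index with two limits."""
    labels = []
    for ci in indices:
        if ci > upper:
            labels.append("exclude")            # severe collinearity: drop it
        elif ci < lower:
            labels.append("include")            # harmless: keep it
        else:
            labels.append("test")               # undecided: needs a further test
    return labels
```

For example, `screen_components(condition_indices(X), lower=5.0, upper=30.0)` returns one label per component; the values 5 and 30 are arbitrary illustrations.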