• Title/Summary/Keyword: multivariate analysis


Canonical Correlation Biplot

  • Park, Mi-Ra;Huh, Myung-Hoe
    • Communications for Statistical Applications and Methods / v.3 no.1 / pp.11-19 / 1996
  • Canonical correlation analysis is a multivariate technique for identifying and quantifying the statistical relationship between two sets of variables. Like most multivariate techniques, its main objective is to reduce the dimensionality of the dataset. It is particularly useful when high-dimensional data can be represented in a low-dimensional space. In this study, we construct statistical graphs for paired sets of multivariate data. Specifically, we propose plots of the observations as well as the variables. We discuss the geometric interpretation and goodness-of-fit of the proposed plots and provide a numerical example.

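As a rough illustration of the quantity underlying the entry above, the following sketch computes canonical correlations between two sets of variables with NumPy. It is not the authors' biplot construction: the function name `canonical_correlations` is hypothetical, the QR-plus-SVD route is one standard way to obtain the correlations, and both data matrices are assumed to have full column rank.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between column sets X (n x p) and Y (n x q).

    Hypothetical helper for illustration; assumes full column rank.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Center each variable so that inner products act like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the two centered column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Clip tiny floating-point overshoot outside [0, 1].
    return np.clip(s, 0.0, 1.0)
```

With one variable in each set, the single value returned equals the absolute Pearson correlation, which makes a convenient sanity check.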

Rank Tests for Multivariate Linear Models in the Presence of Missing Data

  • Lee, Jae-Won;Reboussin, David M.
    • Journal of the Korean Statistical Society / v.26 no.3 / pp.319-332 / 1997
  • The application of multivariate linear rank statistics to data with item nonresponse is considered. Only a modest extension of the complete data techniques is required when the missing data may be thought of as a random sample, and an appropriate modification of the covariances is derived. A proof of the asymptotic multivariate normality is given. A review of some related results in the literature is presented and applications including longitudinal and repeated measures designs are discussed.


Analyzing Operation Deviation in the Deasphalting Process Using Multivariate Statistics Analysis Method

  • Park, Joo-Hwang;Kim, Jong-Soo;Kim, Tai-Suk
    • Journal of Korea Multimedia Society / v.17 no.7 / pp.858-865 / 2014
  • In systems such as an MES (Manufacturing Execution System), various sensors collect data in real time and store it as big data for process monitoring. If this big data is mined in a distributed computing system, the overall process can be improved. In this paper, we build a system for analyzing the causes of operation deviation using big data collected from the deasphalting process at two different plants. By applying multivariate statistical analysis to the big data collected through the MES, we analyze the main causes of operation deviation and present an example for the deasphalting process. Forward stepwise regression yielded a regression equation that explains a 52% increase in performance compared with the existing model. The proposed method can replace the manual analysis used in existing petrochemical processes, which risks being subjective depending on the tester, with an objective analysis based on numbers and statistics.

A Goodness-of-Fit Test for Multivariate Normal Distribution Using Modified Squared Distance

  • Yim, Mi-Hong;Park, Hyun-Jung;Kim, Joo-Han
    • Communications for Statistical Applications and Methods / v.19 no.4 / pp.607-617 / 2012
  • Goodness-of-fit tests for the multivariate normal distribution are important because most multivariate statistical methods are based on the assumption of multivariate normality. We propose goodness-of-fit test statistics for multivariate normality based on the modified squared distance. The empirical percentage points of the null distribution of the proposed statistics are presented via numerical simulations. We compare the performance of several test statistics through a Monte Carlo simulation.
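For orientation only, the sketch below computes the ordinary squared Mahalanobis distance of each observation from the sample mean, the kind of squared distance on which many multivariate normality checks are built. The abstract does not define the authors' modified squared distance, so it is not reproduced here; the function name `squared_mahalanobis` and the use of the unbiased sample covariance are assumptions.

```python
import numpy as np

def squared_mahalanobis(X):
    """Squared Mahalanobis distance of each row of X (n x p) from the sample mean.

    Hypothetical helper: uses the ordinary (unmodified) distance with the
    unbiased sample covariance, not the statistic proposed in the paper.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Unbiased sample covariance (divisor n - 1); atleast_2d covers p = 1.
    S = np.atleast_2d(np.cov(X, rowvar=False))
    # Solve S z = (x - mean) for every row instead of forming an explicit inverse.
    solved = np.linalg.solve(S, centered.T)
    # Row-wise quadratic form (x - mean)^T S^{-1} (x - mean).
    return np.einsum("ij,ji->i", centered, solved)
```

Under multivariate normality these distances are approximately chi-square distributed with p degrees of freedom for large n, which is why they are commonly compared against chi-square quantiles.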

A Comparison Study of Multivariate Binary and Continuous Outcomes

  • Pak, Dae-Woo;Cho, Hyung-Jun
    • The Korean Journal of Applied Statistics / v.25 no.4 / pp.605-612 / 2012
  • Multivariate data with multiple outcomes are often generated in various fields. The outcomes may be a mix of continuous and discrete variables. Because of their complexity, such data are often handled by applying regression analysis to each outcome separately, even though the outcomes are associated with each other. This univariate approach results in low efficiency of the parameter estimates. We study the efficiency gains of multivariate approaches relative to the univariate approach for mixed data that include continuous and binary outcomes. All approaches yield consistent parameter estimates with complete data. By jointly estimating parameters using multivariate methods, it is generally possible to obtain more accurate estimates than with a univariate approach. The association between continuous and binary outcomes creates a gap in efficiency between multivariate and univariate approaches. We provide guidance for analyzing mixed data.

Detecting Influential Observations in Multivariate Statistical Analysis of Incomplete Data by PCA (주성분분석에 의한 결손 자료의 영향값 검출에 대한 연구)

  • 김현정;문승호;신재경
    • The Korean Journal of Applied Statistics / v.13 no.2 / pp.383-392 / 2000
  • Since the late 1970s, methods of influence or sensitivity analysis for detecting influential observations have been studied not only in regression and related methods but also in various multivariate methods. Because the results of multivariate analyses sometimes depend heavily on a small number of observations, conclusions should be drawn with great care. Similar phenomena may also occur with incomplete data. In this research, we study such influential observations in the multivariate statistical analysis of incomplete data. The case of principal component analysis is studied with a numerical example.


Multivariate Statistical Analysis and Prediction for the Flash Points of Binary Systems Using Physical Properties of Pure Substances (순수 성분의 물성 자료를 이용한 2성분계 혼합물의 인화점에 대한 다변량 통계 분석 및 예측)

  • Lee, Bom-Sock;Kim, Sung-Young
    • Journal of the Korean Institute of Gas / v.11 no.3 / pp.13-18 / 2007
  • Multivariate statistical analysis using multiple linear regression (MLR) has been applied to analyze and predict the flash points of binary systems. Predicting the flash points of flammable substances is important for assessing fire and explosion hazards in chemical process design. In this paper, the flash points are predicted by MLR based on the physical properties of pure substances and experimental flash point data. The regression and prediction results from MLR are compared with the values calculated by Raoult's law and the van Laar equation.


A Classification of Regional Pattern Analysis for the Planning in Chungbuk using Multivariate Analysis (다변량분석법을 이용한 충청북도 읍면단위 농촌계획 수립을 위한 지역유형구분 분석)

  • Yoon, Seong-Soo;Joo, Ho-Gil
    • Journal of Korean Society of Rural Planning / v.11 no.2 s.27 / pp.35-41 / 2005
  • The basic concept of rural planning needs to shift from an economy based on production and sales to one based on the experience of natural resources and traditional culture. Multivariate analysis is required to set the development direction for rural districts. This study examines methods for classifying rural villages using existing data and seeks to apply the results in establishing the principal viewpoints for development. The classification uses PCA, CA, and a combination of the two, together with a revised method adapted to the localization of rural districts. We carry out a regional pattern classification for rural district planning in Chungbuk province.

Analysis of Multivariate Process Capability Using Box-Cox Transformation (Box-Cox변환을 이용한 다변량 공정능력 분석)

  • Moon, Hye-Jin;Chung, Young-Bae
    • Journal of Korean Society of Industrial and Systems Engineering / v.42 no.2 / pp.18-27 / 2019
  • Process control methods based on statistical analysis apply an analysis method or mathematical model under the assumption that the process characteristic is normally distributed. However, data collected in real time by automatic measurement systems often do not follow a normal distribution. Among statistical analysis tools, the process capability index (PCI) is widely used at production sites as a measure of process capability, but it is usually applied without testing the normality of the process data. If an analysis method that assumes normality is applied when the assumption is violated, the result will be incorrect and may lead to a wrong action. When the normality assumption is violated, non-normal data can be transformed toward normality with an appropriate normalizing transformation. Of the various transformations available, this paper considers the Box-Cox transformation. The purpose of the study is to extend the analysis method for the multivariate process capability index using the Box-Cox transformation. The study proposes a multivariate process capability index that can be used whether or not the data are normally distributed. Through computational examples, we compare and discuss the multivariate process capability index before and after the Box-Cox transformation when the process data are not normally distributed.
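The Box-Cox transformation mentioned in the entry above can be sketched for a single process variable as follows. This is a minimal standard-library illustration, not the paper's multivariate capability index: the function names and the grid search over lambda are assumptions, and the data must be strictly positive.

```python
import math

def box_cox(values, lam):
    """Box-Cox power transform: log(x) if lam == 0, else (x**lam - 1) / lam."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def profile_log_likelihood(values, lam):
    """Normal profile log-likelihood of lam, up to an additive constant."""
    n = len(values)
    z = box_cox(values, lam)
    mean = sum(z) / n
    var = sum((t - mean) ** 2 for t in z) / n  # maximum-likelihood variance
    # The second term is the log-Jacobian of the transformation.
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)

def choose_lambda(values, grid=None):
    """Pick the grid value of lam that maximizes the profile log-likelihood.

    The default grid (-2.0 to 2.0 in steps of 0.1) is an illustrative assumption.
    """
    if grid is None:
        grid = [i / 10.0 for i in range(-20, 21)]
    return max(grid, key=lambda lam: profile_log_likelihood(values, lam))
```

In a multivariate setting, one simple option is to choose lambda separately for each quality characteristic before computing a capability index on the transformed data; whether the paper does this is not stated in the abstract.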

Multivariate analysis of longitudinal surveys for population median

  • Priyanka, Kumari;Mittal, Richa
    • Communications for Statistical Applications and Methods / v.24 no.3 / pp.255-269 / 2017
  • This article explores the analysis of longitudinal surveys in which the same units are investigated on several occasions. A multivariate exponential ratio type estimator is proposed for estimating the finite population median at the current occasion in two-occasion longitudinal surveys. Information on several additional auxiliary variables, which are stable over time and readily available on both occasions, is utilized. Properties of the proposed multivariate estimator, including the optimum replacement strategy, are presented. The proposed multivariate estimator is compared with the sample median estimator when there is no matching from the previous occasion, and with the exponential ratio type estimator in successive sampling when information is available on only one additional auxiliary variable. The merits of the proposed estimator are justified by empirical interpretations and validated by a simulation study using some natural populations.