Conference record number:
3976
Paper title:
Common Factor Analysis
Authors:
Peter Wentzell (Wentzell@dal.ca), Dalhousie University, Halifax, NS, B3H 4J3, Canada; Mohsen Kompany-Zareh, Dalhousie University, Halifax, NS, B3H 4J3, Canada; Fazal Mabood, University of Nizwa, Sultanate of Oman; Cannon Giglio, Dalhousie University, Halifax, NS, B3H 4J3, Canada
Keywords:
Factor Analysis, Maximum Likelihood, Principal Axis Factoring, Noise, Heteroscedastic, Multivariate Normal
Conference title:
6th Iranian Biennial National Seminar of Chemometrics
Abstract:
Generally defined, factor analysis (FA) is any method that decomposes a data matrix (or, in the more general case, a data tensor) into a bilinear (or multilinear) model of lower dimensionality. Principal components analysis (PCA) is the most popular FA technique in chemistry [1]. Another popular FA technique is maximum likelihood common factor analysis (MLCFA or CFA) [2,3], which is available in the MATLAB Statistics Toolbox as factoran.m. Although both are maximum likelihood based, MLCFA differs from MLPCA [4]. Principal axis factoring (PAF) is another common factor analysis technique [5,6]; it is similar to CFA in that it decomposes the data into specific factors in addition to common factors. This report compares CFA, PAF, and PCA with respect to their resulting profiles and subspaces.
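The distinction between PCA and the common factor model can be sketched in a few lines. The sketch below uses Python with scikit-learn (a stand-in for the MATLAB factoran.m mentioned above, not the authors' actual workflow): scikit-learn's FactorAnalysis fits a maximum likelihood common factor model with a separate specific (unique) variance per variable, whereas PCA implicitly assumes iid errors. The simulated data here are illustrative, not the data sets studied in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(0)

# Illustrative bilinear data: 3 common factors, 50 samples x 10 variables,
# plus per-variable (column-heteroscedastic) noise.
n, p, k = 50, 10, 3
scores = rng.normal(size=(n, k))
loadings = rng.normal(size=(k, p))
noise_sd = rng.uniform(0.1, 0.5, size=p)  # unequal specific variances
X = scores @ loadings + rng.normal(size=(n, p)) * noise_sd

# PCA: orthogonal loadings, no separate specific-variance term.
pca = PCA(n_components=k).fit(X)

# ML common factor analysis: estimates common-factor loadings plus one
# specific (unique) variance per variable.
cfa = FactorAnalysis(n_components=k).fit(X)

print(pca.components_.shape)      # loadings, one row per component
print(cfa.components_.shape)      # common-factor loadings
print(cfa.noise_variance_.shape)  # estimated specific variances, one per variable
```

The key difference visible here is that the common factor model returns an explicit specific-variance estimate per variable, which is what lets it accommodate column-heteroscedastic noise.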
Simulated data sets, including multivariate normal and non-multivariate normal (chromatographic and spectral-kinetic) data, were considered, in addition to an experimental data set comprising fatty acid profiles of different groups of fish samples. Different types of noise, including independent and identically distributed (iid), column-heteroscedastic, and general heteroscedastic noise, were added to the simulated data at different levels.
In the presence of iid noise, the results from CFA, PAF, and PCA are almost the same, and the angle between the calculated and true profile subspaces is larger when using non-multivariate normal data. In the presence of heteroscedastic noise, the subspaces of the CFA and PAF profiles are closer to that of the true profiles than the PCA subspace is; CFA and PAF then yield different profiles and reconstruct the covariance matrix of the data better than PCA, which assumes iid errors. In the case of multivariate normal (MN) data, a likelihood- and chi-squared-based statistical test can be applied to determine the optimal number of factors in the model. A major advantage of PAF over CFA is that PAF can be applied to “fat” data, in which the number of samples (rows) is lower than the number of variables (columns). MLPCA yields the best reconstruction of the data covariance and the smallest angle between the subspaces of the estimated and true profiles; however, it requires the noise structure to be completely known.
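The angle between estimated and true profile subspaces used throughout the comparison can be computed as the principal angles between the two column spaces. A minimal Python sketch with SciPy (an assumed implementation choice, not the paper's code; the profiles here are synthetic placeholders for CFA/PAF/PCA loadings):

```python
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(1)

# True loading profiles (columns span the true factor subspace) and an
# estimate perturbed by a small amount of noise, standing in for the
# loadings recovered by CFA, PAF, or PCA.
true_profiles = rng.normal(size=(10, 3))
estimated = true_profiles + 0.05 * rng.normal(size=(10, 3))

# Principal angles (in radians, descending order) between the two column
# spaces; the largest angle summarizes how far the estimated subspace is
# from the true one.
angles = subspace_angles(true_profiles, estimated)
max_angle_deg = np.degrees(angles.max())
print(f"largest principal angle: {max_angle_deg:.2f} degrees")
```

A small perturbation of the profiles yields small principal angles, while a poorly recovered subspace drives the largest angle toward 90 degrees.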