Title :
Parallel Computation of Modified Stahel-Donoho Estimators for Multivariate Outlier Detection
Author :
Wada, Kazuyoshi ; Tsubaki, Hiroe
Author_Institution :
Nat. Stat. Center, Tokyo, Japan
Abstract :
The modified Stahel-Donoho (MSD) estimators provide an orthogonally equivariant method for multivariate outlier detection with a high breakdown point for all dimensions. An R function implementing the MSD estimators has been created and its performance confirmed; however, the method suffers from the curse of dimensionality, and its implementation is limited to relatively low-dimensional datasets. This paper proposes a parallel computing approach to cope with higher dimensionality and presents results for a few datasets to illustrate its use. Code for both the parallelized function and the original single-core function has been placed in a public repository for further evaluation.
Keywords :
Big Data; estimation theory; parallel algorithms; MSD estimators; curse of dimensionality; low dimensional datasets; modified Stahel-Donoho estimators; multivariate outlier detection method; parallel algorithm; parallel computing approach; parallelized function; single-core function; Computational modeling; Correlation; Covariance matrices; Electric breakdown; Robustness; Software; Vectors; Mahalanobis distance; multivariate location and scatter; outlier detection; projection pursuit
Conference_Title :
Cloud Computing and Big Data (CloudCom-Asia), 2013 International Conference on
Conference_Location :
Fuzhou
Print_ISBN :
978-1-4799-2829-3
DOI :
10.1109/CLOUDCOM-ASIA.2013.86