Title :
Computationally Efficient Estimators for Dimension Reductions Using Stable Random Projections
Author_Institution :
Dept. of Stat. Sci., Cornell Univ., Ithaca, NY
Abstract :
The method of stable random projections is an efficient tool for computing the lalpha distances using low memory, where 0 < alpha les 2 may be viewed as a tuning parameter. This method boils down to a statistical estimation task and various estimators have been proposed, based on the geometric mean, harmonic mean, and fractional power etc. This study proposes the optimal quantile estimator, whose main operation is selecting, which is considerably less expensive than taking fractional power, the main operation in previous estimators. Our experiments report that this estimator is nearly one order of magnitude more computationally efficient than previous estimators. For large-scale tasks in which storing and computing pairwise distances is a serious bottleneck, this estimator should be desirable. In addition to its computational advantage, the optimal quantile estimator exhibits nice theoretical properties. It is more accurate than previous estimators when alpha > 1. We derive its theoretical error bound and establish the explicit (i.e., no hidden constants) sample complexity bound.
Keywords :
data reduction; estimation theory; random processes; statistical analysis; dimension reduction; fractional power; geometric mean; harmonic mean; optimal quantile estimator; stable random projection; statistical estimation; tuning parameter; Costs; Data mining; Image color analysis; Information science; Large-scale systems; Machine learning; Machine learning algorithms; Power system harmonics; Streaming media; USA Councils;
Conference_Titel :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3502-9
DOI :
10.1109/ICDM.2008.95