• DocumentCode
    2709589
  • Title

    Computationally Efficient Estimators for Dimension Reductions Using Stable Random Projections

  • Author

    Li, Ping

  • Author_Institution
    Dept. of Stat. Sci., Cornell Univ., Ithaca, NY
  • fYear
    2008
  • fDate
    15-19 Dec. 2008
  • Firstpage
    403
  • Lastpage
    412
  • Abstract
    The method of stable random projections is an efficient tool for computing the lalpha distances using low memory, where 0 < alpha les 2 may be viewed as a tuning parameter. This method boils down to a statistical estimation task and various estimators have been proposed, based on the geometric mean, harmonic mean, and fractional power etc. This study proposes the optimal quantile estimator, whose main operation is selecting, which is considerably less expensive than taking fractional power, the main operation in previous estimators. Our experiments report that this estimator is nearly one order of magnitude more computationally efficient than previous estimators. For large-scale tasks in which storing and computing pairwise distances is a serious bottleneck, this estimator should be desirable. In addition to its computational advantage, the optimal quantile estimator exhibits nice theoretical properties. It is more accurate than previous estimators when alpha > 1. We derive its theoretical error bound and establish the explicit (i.e., no hidden constants) sample complexity bound.
  • Keywords
    data reduction; estimation theory; random processes; statistical analysis; dimension reduction; fractional power; geometric mean; harmonic mean; optimal quantile estimator; stable random projection; statistical estimation; tuning parameter; Costs; Data mining; Image color analysis; Information science; Large-scale systems; Machine learning; Machine learning algorithms; Power system harmonics; Streaming media; USA Councils;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
  • Conference_Location
    Pisa
  • ISSN
    1550-4786
  • Print_ISBN
    978-0-7695-3502-9
  • Type

    conf

  • DOI
    10.1109/ICDM.2008.95
  • Filename
    4781135