• Title of article

    Intrinsic dimension identification via graph-theoretic methods

  • Author/Authors

    Brito, M.R. and Quiroz, A.J. and Yukich, J.E.

  • Issue Information
    Biannual, with serial number, year 2013
  • Pages
    15
  • From page
    263
  • To page
    277
  • Abstract
    Three graph theoretical statistics are considered for the problem of estimating the intrinsic dimension of a data set. The first is the “reach” statistic, $\bar{r}_{j,k}$, proposed in Brito et al. (2002) [4] for the problem of identification of Euclidean dimension. The second, $M_n$, is the sample average of squared degrees in the minimum spanning tree of the data, while the third statistic, $U_n^k$, is based on counting the number of common neighbors among the $k$-nearest, for each pair of sample points $\{X_i, X_j\}$, $i < j \le n$. For the first and third of these statistics, central limit theorems are proved under general assumptions, for data living in an $m$-dimensional $C^1$ submanifold of $\mathbb{R}^d$, and in this setting, we establish the consistency of intrinsic dimension identification procedures based on $\bar{r}_{j,k}$ and $U_n^k$. For $M_n$, asymptotic results are provided whenever data live in an affine subspace of Euclidean space. The graph theoretical methods proposed are compared, via simulations, with a host of recently proposed nearest neighbor alternatives.
  • Keywords
    intrinsic dimension , Graph theoretical methods , Dimensionality reduction , Stabilization methods
  • Journal title
    Journal of Multivariate Analysis
  • Serial Year
    2013
  • Record number

    1566208
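
The abstract defines $M_n$ as the sample average of squared degrees in the minimum spanning tree of the data. A minimal sketch of that computation is given below, assuming Euclidean distances and a simple O(n^2) Prim's algorithm; the function name `mst_squared_degree_average` is hypothetical and is not taken from the article, and the step that turns $M_n$ into a dimension estimate is not reproduced here.

```python
import math


def mst_squared_degree_average(points):
    """Average of squared vertex degrees in the Euclidean minimum spanning tree.

    `points` is a sequence of equal-length coordinate tuples.
    Illustrative helper only; the name and interface are assumptions.
    """
    n = len(points)
    if n < 2:
        return 0.0

    degree = [0] * n              # MST degree of each point
    in_tree = [False] * n         # whether a point has joined the tree
    best_dist = [math.inf] * n    # cheapest known edge into the tree
    best_from = [-1] * n          # tree endpoint of that cheapest edge
    best_dist[0] = 0.0            # start Prim's algorithm at point 0

    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            # Adding edge (best_from[u], u) raises both endpoint degrees.
            degree[u] += 1
            degree[best_from[u]] += 1
        # Update the cheapest connection for the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u

    return sum(d * d for d in degree) / n
```

For four equally spaced points on a line the tree is a path with degrees 1, 2, 2, 1, so the function returns (1 + 4 + 4 + 1) / 4 = 2.5.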