• DocumentCode
    1950656
  • Title
    Intrinsic dimension of a dataset: what properties does one expect?
  • Author
    Pestov, Vladimir
  • Author_Institution
    Univ. of Ottawa, Ottawa
  • fYear
    2007
  • fDate
    12-17 Aug. 2007
  • Firstpage
    2959
  • Lastpage
    2964
  • Abstract
    We propose an axiomatic approach to the concept of the intrinsic dimension of a dataset, based on the viewpoint of the geometry of high-dimensional structures. Our first axiom postulates that high values of dimension be indicative of the presence of the curse of dimensionality (in a certain precise mathematical sense). The second axiom requires the dimension to depend smoothly on a distance between datasets (so that the dimension of a dataset and that of an approximating principal manifold would be close to each other). The third axiom is a normalization condition: the dimension of the Euclidean n-sphere S^n is Θ(n). We give an example of a dimension function satisfying our axioms, even though it is in general computationally infeasible, and discuss a computationally cheap function satisfying most but not all of our axioms (the "intrinsic dimensionality" of Chavez et al.).
  • Keywords
    data structures; Euclidean sphere; axiomatic approach; dataset intrinsic dimension; dimension function; geometry; high-dimensional structures; principal manifold approximation; Data engineering; Extraterrestrial measurements; Geometry; Mathematical model; Mathematics; Neural networks; Probability distribution; Solid modeling; Statistical analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2007. IJCNN 2007. International Joint Conference on
  • Conference_Location
    Orlando, FL
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1379-9
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2007.4371431
  • Filename
    4371431
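
The "intrinsic dimensionality" of Chavez et al. mentioned in the abstract is usually stated as μ²/(2σ²), where μ and σ² are the mean and variance of the pairwise distances in the dataset. The following is a minimal sketch of that quantity, not code from the paper: the function names `intrinsic_dimensionality` and `sample_sphere`, the sample size, and the choice of the Euclidean distance are illustrative assumptions.

```python
import math
import random
import statistics
from itertools import combinations


def intrinsic_dimensionality(points):
    """Return mu^2 / (2 * sigma^2) over all pairwise Euclidean distances.

    `points` is a sequence of equal-length coordinate sequences.
    (Hypothetical helper name; the formula is the one attributed to Chavez et al.)
    """
    distances = [math.dist(p, q) for p, q in combinations(points, 2)]
    mu = statistics.fmean(distances)
    var = statistics.pvariance(distances, mu)
    return mu * mu / (2.0 * var)


def sample_sphere(n, count, rng):
    """Draw `count` points uniformly from the unit sphere S^n in R^(n+1).

    Uses the standard trick of normalizing a vector of independent Gaussians.
    """
    points = []
    for _ in range(count):
        v = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
        norm = math.sqrt(sum(x * x for x in v))
        points.append([x / norm for x in v])
    return points


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for n in (2, 8, 32):
        rho = intrinsic_dimensionality(sample_sphere(n, 300, rng))
        print(f"S^{n}: estimated intrinsic dimensionality = {rho:.2f}")
```

Computing all pairwise distances costs O(m²) for m points, so on larger datasets one would typically estimate μ and σ² from a random sample of pairs instead.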