• DocumentCode
    1746847
  • Title
    Visualization and knowledge discovery for high dimensional data
  • Author
    Inselberg, Alfred
  • Author_Institution
    Sch. of Math. Sci., Tel Aviv Univ., Israel
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    5
  • Lastpage
    24
  • Abstract
    The goal of the article is to present a multidimensional visualization methodology and its applications to visual and automatic knowledge discovery. Visualization provides insight through images and can be considered as a collection of application-specific mappings: ProblemDomain→VisualRange. For the visualization of multivariate problems, a multidimensional system of parallel coordinates (||-coords) is constructed which induces a one-to-one mapping between subsets of N-space and subsets of 2-space. The result is a rigorous methodology for doing and seeing N-dimensional geometry. We start with an overview of the mathematical foundations, where it is seen that, from the display of high-dimensional datasets, the search for multivariate relations among the variables is transformed into a 2D pattern recognition problem. This is the basis for the application to visual knowledge discovery, which is illustrated in the second part with a real dataset of VLSI production. Then a recent geometric classifier is presented and applied to 3 real datasets. Compared with the results of 23 other classifiers, its results have the least error. The algorithm has quadratic computational complexity in the size and number of parameters, provides comprehensible and explicit rules, does dimensionality selection, and orders these variables so as to optimize the clarity of separation between the designated set and its complement. Finally, a simple visual economic model of a real country is constructed and analyzed in order to illustrate the special strength of ||-coords in modeling multivariate relations by means of hypersurfaces.
  • Keywords
    computational complexity; data mining; data visualisation; pattern classification; set theory; ||-coords; 2D pattern recognition problem; N-dimensional geometry; VLSI production; application specific mappings; automatic knowledge discovery; dimensionality selection; geometric classifier; high dimensional data; high-dimensional datasets; hypersurfaces; mathematical foundations; multidimensional system; multidimensional visualization methodology; multivariate problems; multivariate relations; one-to-one mapping; parallel coordinates; quadratic computational complexity; real country; real dataset; rigorous methodology; simple visual economic model; visual knowledge discovery; Algorithm design and analysis; Computational complexity; Data visualization; Design optimization; Geometry; Multidimensional systems; Pattern recognition; Production; Two dimensional displays; Very large scale integration
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    User Interfaces to Data Intensive Systems, 2001. UIDIS 2001. Proceedings. Second International Workshop on
  • Conference_Location
    Zurich
  • ISSN
    1530-1893
  • Print_ISBN
    0-7695-0834-0
  • Type
    conf
  • DOI
    10.1109/UIDIS.2001.929921
  • Filename
    929921
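
The abstract describes parallel coordinates as a one-to-one mapping from N-space into the plane that turns the search for multivariate relations into a 2D pattern recognition problem. The following is a minimal Python sketch of that idea, not the paper's implementation: all function names (to_polyline, from_polyline, line_dual_point, segment_height_at) and the unit axis spacing are assumptions made for illustration. It maps a point to the polyline joining its coordinate values on equally spaced vertical axes, and shows the standard point-line duality in which all points on the line x2 = m*x1 + b (with m != 1) produce segments that pass through one common 2D point.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float]


def to_polyline(point: Sequence[float], spacing: float = 1.0) -> List[Vertex]:
    """Map an N-dimensional point to the vertices of its polyline.

    Axis i is the vertical line x = i * spacing (an assumed layout); the
    point's i-th coordinate becomes the height of the vertex on that axis.
    """
    return [(i * spacing, float(value)) for i, value in enumerate(point)]


def from_polyline(vertices: Sequence[Vertex]) -> List[float]:
    """Recover the N-dimensional point by reading the height on each axis."""
    return [y for _, y in vertices]


def line_dual_point(m: float, b: float, spacing: float = 1.0) -> Vertex:
    """2D point shared by the segments of all points on x2 = m*x1 + b."""
    if m == 1:
        # Slope 1 gives parallel segments, so there is no finite meeting point.
        raise ValueError("slope 1 has no finite dual point")
    return (spacing / (1.0 - m), b / (1.0 - m))


def segment_height_at(vertices: Sequence[Vertex], x: float) -> float:
    """Height at abscissa x of the line through the first two vertices."""
    (x0, y0), (x1, y1) = vertices[0], vertices[1]
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)
```

For example, every point on x2 = -x1 + 4 yields a segment through line_dual_point(-1.0, 4.0), so a linear relation between two variables appears in the display as a visible crossing point between the two axes.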