DocumentCode
2404994
Title
Towards meaningful high-dimensional nearest neighbor search by human-computer interaction
Author
Aggarwal, Charu C.
Author_Institution
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
fYear
2002
fDate
2002
Firstpage
593
Lastpage
604
Abstract
Nearest neighbor search is an important and widely used problem in a number of important application domains. In many of these domains, the dimensionality of the data representation is often very high. Recent theoretical results have shown that the concept of proximity or nearest neighbors may not be very meaningful for the high dimensional case. Therefore, it is often a complex problem to find good quality nearest neighbors in such data sets. Furthermore, it is also difficult to judge the value and relevance of the returned results. In fact, it is hard for any fully automated system to satisfy a user about the quality of the nearest neighbors found unless he is directly involved in the process. This is especially the case for high dimensional data in which the meaningfulness of the nearest neighbors found is questionable. We address the complex problem of high dimensional nearest neighbor search from the user perspective by designing a system which uses effective cooperation between the human and the computer. The system provides the user with visual representations of carefully chosen subspaces of the data in order to repeatedly elicit his preferences about the data patterns which are most closely related to the query point. These preferences are used in order to determine and quantify the meaningfulness of the nearest neighbors. Our system is not only able to find and quantify the meaningfulness of the nearest neighbors, but is also able to diagnose situations in which the nearest neighbors found are truly not meaningful.
Keywords
data mining; data structures; database management systems; query processing; user interfaces; data patterns; data representation; data sets; databases; high dimensional data; high-dimensional nearest neighbor search; human-computer interaction; query point; visual representations; Data engineering; Humans; Information retrieval; Multimedia databases; Nearest neighbor searches; Spatial databases
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering, 2002. Proceedings. 18th International Conference on
Conference_Location
San Jose, CA
ISSN
1063-6382
Print_ISBN
0-7695-1531-2
Type
conf
DOI
10.1109/ICDE.2002.994777
Filename
994777