Title :
Multipath querying of hierarchically tree structured document databases in vector spaces
Author :
Ayan, Ugur ; Bayazit, U. ; Gurgen, Fikret
Author_Institution :
Bogazici Univ., Istanbul, Turkey
Abstract :
Representing large document databases (Web pages, articles, and book and magazine titles) as matrices simplifies and expedites text querying and retrieval. In the literature, dimensionality reduction techniques based on singular value decomposition and principal component analysis have been proposed to reduce the high computational complexity that results from using high-dimensional matrices and vectors. Serkan Kaya et al. (2002) proposed organizing the text database as a hierarchical tree structure, with single-path and multipath querying over this structure, as a technique that reduces computational complexity in addition to dimensionality reduction. We analyze and compare the tradeoff between computational complexity and performance for the static and adaptive multipath querying methods by varying the number of paths.
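The complexity/performance tradeoff described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each tree node stores a vector (a cluster centroid for internal nodes, a document vector for leaves), that similarity is cosine similarity, and that "static multipath" means keeping a fixed number of best-scoring branches at every level. The names Node, cosine, multipath_query and build_example_tree are hypothetical. The adaptive variant is not specified in the abstract and is not sketched here.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Node:
    # Assumption: internal nodes hold a cluster centroid, leaves hold a document vector.
    vector: list
    children: list = field(default_factory=list)
    doc_id: str = None


def multipath_query(root, query, num_paths):
    """Descend the tree level by level, keeping the num_paths best children.

    Returns (ranked leaves as (score, doc_id), number of similarity computations).
    num_paths == 1 corresponds to single-path querying.
    """
    frontier = [root]
    comparisons = 0
    leaves = []
    while frontier:
        candidates = []
        for node in frontier:
            for child in node.children:
                comparisons += 1
                candidates.append((cosine(query, child.vector), child))
        # Keep only the best-scoring branches (fixed width: the "static" assumption).
        candidates.sort(key=lambda sc: sc[0], reverse=True)
        frontier = []
        for score, child in candidates[:num_paths]:
            if child.children:
                frontier.append(child)
            else:
                leaves.append((score, child.doc_id))
    leaves.sort(reverse=True)
    return leaves, comparisons


def build_example_tree():
    """Toy two-cluster tree with made-up 2-D vectors, for illustration only."""
    cluster_a = Node([1.0, 0.0], children=[
        Node([1.0, 0.1], doc_id="a1"),
        Node([0.9, 0.3], doc_id="a2"),
    ])
    cluster_b = Node([0.0, 1.0], children=[
        Node([0.1, 1.0], doc_id="b1"),
        Node([0.6, 0.8], doc_id="b2"),
    ])
    return Node([0.5, 0.5], children=[cluster_a, cluster_b])


if __name__ == "__main__":
    tree = build_example_tree()
    for k in (1, 2):
        ranked, cost = multipath_query(tree, [0.8, 0.6], k)
        print(k, [doc for _, doc in ranked], cost)
```

In this toy tree the query is closer to the centroid of the first cluster, yet its most similar document sits in the second cluster, so following one path misses it while following two paths finds it at the cost of more similarity computations. Varying num_paths is the knob that trades computation against retrieval quality in this sketch.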
Keywords :
computational complexity; query processing; tree data structures; tree searching; trees (mathematics); very large databases; adaptive multipath querying; computational complexity; dimensionality reduction techniques; hierarchical tree structure; matrices; principal component analysis; single path querying; singular value decomposition; text querying; text retrieval; tree structured document databases; vector spaces; Books; Computational complexity; Databases; Information retrieval; Matrix decomposition; Performance analysis; Principal component analysis; Singular value decomposition; Tree data structures; Web pages;
Conference_Title :
Signal Processing and Communications Applications Conference, 2004. Proceedings of the IEEE 12th
Print_ISBN :
0-7803-8318-4
DOI :
10.1109/SIU.2004.1338605