DocumentCode :
2272533
Title :
Applying LSI and data reduction to XML for counter terrorism
Author :
Demurjian, S. ; Rajasekaran, Sanguthevar ; Ammar, R. ; Greenshields, I. ; Doan, T. ; He, L.
Author_Institution :
Dept. of Comput. Sci. & Eng., Connecticut Univ., Storrs, CT
fYear :
2006
fDate :
0-0 0
Abstract :
Data reduction is a critical problem for counter-terrorism; large collections of documents must be analyzed and processed, raising issues related to performance, lossless reduction, polysemy (the meaning of individual words being influenced by their surrounding words), and synonymy (the possibility of the same term being described in different ways). In this paper, we begin by presenting a survey of latent semantic indexing (LSI) techniques and strategies. Next, we highlight a subset of LSI software packages that are available (commercially and academically). Then, we explore approaches that apply LSI to eXtensible Markup Language (XML) data. Using this as a basis, the paper proposes an approach that applies LSI and data reduction to XML documents by transitioning from support vector machines (SVM) to random projections to LSI, and also postulates on the exploitation of semantics of Web-based documents that are captured via XML tags.
Keywords :
XML; data reduction; indexing; semantic Web; support vector machines; terrorism; Web-based documents; XML data; XML documents; XML tags; counterterrorism; eXtensible Markup Language; latent semantic indexing; lossless reduction; polysemy; synonymy; Application software; Computer science; Counting circuits; Data engineering; Helium; Large scale integration
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Aerospace Conference, 2006 IEEE
Conference_Location :
Big Sky, MT
Print_ISBN :
0-7803-9545-X
Type :
conf
DOI :
10.1109/AERO.2006.1656047
Filename :
1656047