• DocumentCode
    2272533
  • Title
    Applying LSI and data reduction to XML for counter terrorism
  • Author
    Demurjian, S. ; Rajasekaran, Sanguthevar ; Ammar, R. ; Greenshields, I. ; Doan, T. ; He, L.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Connecticut Univ., Storrs, CT
  • fYear
    2006
  • fDate
    0-0 0
  • Abstract
    Data reduction is a critical problem for counter-terrorism; large collections of documents must be analyzed and processed, raising issues related to performance, lossless reduction, polysemy (the meaning of individual words being influenced by their surrounding words), and synonymy (the possibility of the same term being described in different ways). In this paper, we begin by presenting a survey of latent semantic indexing (LSI) techniques and strategies. Next, we highlight a subset of LSI software packages that are available (commercially and academically). Then, we explore approaches that apply LSI to eXtensible Markup Language (XML) data. Using this as a basis, the paper proposes an approach that applies LSI and data reduction to XML documents by transitioning from support vector machines (SVM) to random projections to LSI, and also postulates on the exploitation of semantics of Web-based documents that are captured via XML tags.
  • Keywords
    XML; data reduction; indexing; semantic Web; support vector machines; terrorism; Web-based documents; XML data; XML documents; XML tags; counterterrorism; eXtensible Markup Language; latent semantic indexing; lossless reduction; polysemy; synonymy; Application software; Computer science; Counting circuits; Data engineering; Helium; Large scale integration
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Aerospace Conference, 2006 IEEE
  • Conference_Location
    Big Sky, MT
  • Print_ISBN
    0-7803-9545-X
  • Type
    conf
  • DOI
    10.1109/AERO.2006.1656047
  • Filename
    1656047