Title :
Hardware accelerated algorithms for semantic processing of document streams
Author :
Eick, Stephen G. ; Lockwood, John W. ; Loui, Ron ; Levine, Andrew ; Mauger, Justin ; Weishar, Doyle J. ; Ratner, Alan ; Byrnes, J.
Author_Institution :
SSS Res., Inc., Naperville, IL
Abstract :
There is a need within the intelligence communities to analyze massive streams of multilingual unstructured data. Mathematical transformation algorithms have proven effective at interpreting multilingual, unstructured data, but the high computational requirements of such algorithms prevent their widespread use. The rate of computation can be vastly increased with field programmable gate array (FPGA) hardware. To experiment with this approach, we developed a system with FPGAs that ingests content over a network at high data rates. The system extracts basewords, counts words, scores documents, and discovers concepts on data that are carried in TCP/IP network flows as packets over a Gigabit Ethernet link or in cells transported over an OC48 link. Implementing these algorithms in FPGA hardware introduces certain constraints on the complexity and richness of the semantic processing. To understand the implications of these constraints and to benchmark the performance of the system, we have performed a series of experiments processing multilingual documents. In these experiments, we compare techniques to generate basewords for our semantic concepts, score documents, and discover concepts across a variety of operational processing scenarios.
Keywords :
field programmable gate arrays; local area networks; natural languages; text analysis; FPGA hardware; Gigabit Ethernet link; OC48 link; TCP/IP network flows; baseword extraction; document scoring; document streams; field programmable gate array; hardware accelerated algorithms; multilingual document processing; multilingual unstructured data; semantic concepts; semantic processing; word counting; Acceleration; Computational intelligence; Computational linguistics; Data mining; Field programmable gate arrays; Hardware; National security; Sorting; Testing; Text analysis
Conference_Title :
Aerospace Conference, 2006 IEEE
Conference_Location :
Big Sky, MT
Print_ISBN :
0-7803-9545-X
DOI :
10.1109/AERO.2006.1656050