Title :
An architecture for streaming coclustering in high speed hardware
Author :
Byrnes, John ; Rohwer, Richard
Author_Institution :
HNC Software, LLC, Fair Isaac Corp., San Diego, CA
Abstract :
We seek to learn the semantics of a data stream at optical line speed. We focus on text data, but the techniques developed should apply to broad modalities of network data wherever appropriate features can be computed rapidly enough. We consider a custom hardware system designed to categorize documents based on feature clusters and document clusters that have been learned offline on standard general-purpose computers, and we present a technique for extending this system to permit online learning from arbitrarily large data sets.
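As a rough illustration of the categorization step the abstract describes, the following minimal sketch maps each token to a feature cluster learned offline, builds a histogram over feature clusters, and assigns the document to the best-scoring document cluster. The abstract does not specify the scoring rule or data layout; the log-likelihood score, the table names (`FEATURE_CLUSTER`, `DOC_CLUSTER_PROFILES`), and all values here are assumptions made for illustration, not the authors' hardware design.

```python
import math
from collections import Counter

# Hypothetical tables standing in for clusters learned offline
# (illustrative values only, not taken from the paper).
FEATURE_CLUSTER = {
    "router": 0, "packet": 0, "fiber": 0,
    "gene": 1, "protein": 1, "cell": 1,
}
NUM_FEATURE_CLUSTERS = 2

# Assumed P(feature cluster | document cluster), one row per document cluster.
DOC_CLUSTER_PROFILES = [
    [0.9, 0.1],
    [0.2, 0.8],
]


def cluster_histogram(tokens):
    """Count how many tokens fall into each feature cluster; skip unknown tokens."""
    counts = Counter(FEATURE_CLUSTER[t] for t in tokens if t in FEATURE_CLUSTER)
    return [counts.get(k, 0) for k in range(NUM_FEATURE_CLUSTERS)]


def categorize(tokens):
    """Return the index of the document cluster with the highest log-likelihood."""
    hist = cluster_histogram(tokens)
    scores = [
        sum(n * math.log(p) for n, p in zip(hist, profile))
        for profile in DOC_CLUSTER_PROFILES
    ]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    print(categorize("the router dropped a packet".split()))
```

Because each token needs only a table lookup and a counter increment, a per-document pass of this kind is the sort of operation that can plausibly be streamed; how the paper realizes it in hardware, and how it extends the tables online, is not stated in the abstract.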
Keywords :
data mining; feature extraction; pattern clustering; text analysis; data stream semantics; document categorization; document clusters; feature clusters; high speed hardware; network data; online learning; streaming coclustering; text data; Biomedical optical imaging; Clustering algorithms; Computer architecture; Computer networks; Hardware; High speed optical techniques; Intelligent networks; Optical fiber networks; Software algorithms; Vocabulary
Conference_Title :
Aerospace Conference, 2006 IEEE
Conference_Location :
Big Sky, MT
Print_ISBN :
0-7803-9545-X
DOI :
10.1109/AERO.2006.1656051