DocumentCode :
3038745
Title :
Telcordia LSI Engine: implementation and scalability issues
Author :
Chen, Chung-Min ; Stoffel, Ned ; Post, Mike ; Basu, Chumki ; Bassu, Devasis ; Behrens, Clifford
Author_Institution :
Telcordia Technol. Inc., Morristown, NJ, USA
fYear :
2001
fDate :
2001
Firstpage :
51
Lastpage :
58
Abstract :
Latent Semantic Indexing (LSI), a vector space-based approach to information retrieval, has been proven to be an effective tool in correlating and retrieving relevant documents. While much work has been published on LSI, most of it addresses the algorithmic or theoretical basis of the model. Little, if any, presents implementation issues in practice. We describe a production-level implementation of LSI. The system integrates components including document collection and preprocessing, singular value decomposition (SVD), multilingual processing, and a tree-based access method for similarity querying. We discuss implementation issues encountered during the development of the system. In particular, we address scalability issues in the query engine and various components of the system, and present lessons learned.
Keywords :
indexing; information retrieval; search engines; singular value decomposition; Latent Semantic Indexing; Telcordia LSI Engine; document collection; multilingual processing; relevant document retrieval; scalability; similarity querying; tree-based access method; vector space-based approach; Engines; Indexing; Information retrieval; Large scale integration; Multidimensional systems; Optimized production technology; Scalability; Singular value decomposition; Space technology; Web pages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Research Issues in Data Engineering, 2001. Proceedings. Eleventh International Workshop on
Conference_Location :
Heidelberg
Print_ISBN :
0-7695-0957-6
Type :
conf
DOI :
10.1109/RIDE.2001.916491
Filename :
916491