Title :
A weakly supervised optimize method in latent semantic indexing
Author :
Ji, Duo ; Guo, Dongbo ; Cai, Dongfeng ; Bai, Yu
Author_Institution :
Knowledge Eng. Res. Center, Shenyang Inst. of Aeronaut. Eng., Shenyang, China
Abstract :
Latent Semantic Indexing (LSI) is an effective feature extraction method that has been applied to many text learning tasks, such as text clustering and information retrieval. This paper thoroughly analyses the influence of term co-occurrences on the LSI mapping and proposes a method, called pseudo document, that strengthens beneficial term co-occurrences by adding heuristic knowledge to the text collection, making the LSI mapping more reasonable. Experimental results show that the pseudo document method effectively improves the performance of patent retrieval.
Keywords :
computational linguistics; feature extraction; indexing; information retrieval; learning (artificial intelligence); text analysis; feature extraction; heuristic knowledge; information retrieval; latent semantic indexing; pseudo document method; term co-occurrence; text clustering; text learning task; weakly supervised optimize method; Aerospace engineering; Computer aided instruction; Feature extraction; Indexing; Information retrieval; Knowledge engineering; Large scale integration; Least squares approximation; Matrix decomposition; Optimization methods; Latent Semantic Indexing; Patent Retrieval; Pseudo Document; Term Co-occurrence;
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
DOI :
10.1109/NLPKE.2009.5313756
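
The abstract describes strengthening beneficial term co-occurrences by adding pseudo documents to the collection before the LSI mapping is computed. The record gives no details of the authors' procedure, so the following is only a minimal sketch of the general idea under stated assumptions: the toy corpus, the term pair treated as "heuristic knowledge", the rank k = 2, and all function names are hypothetical, and LSI is taken here to be a plain truncated SVD of a raw term-count matrix.

```python
import numpy as np

# Toy corpus (hypothetical). "engine" and "motor" never co-occur in it.
DOCS = [
    "engine turbine blade",
    "engine turbine fuel",
    "motor rotor shaft",
    "motor rotor coil",
    "patent claim method",
]

# Assumed heuristic knowledge: one pseudo document that places two
# related terms in the same document.
PSEUDO_DOCS = ["engine motor"]


def term_doc_matrix(docs, vocab):
    """Build a raw-count term-by-document matrix (rows are terms)."""
    index = {term: i for i, term in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for token in doc.split():
            matrix[index[token], j] += 1.0
    return matrix


def lsi_term_vectors(matrix, k):
    """Return rank-k term vectors U_k * S_k from a truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 if either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def compare_similarity(term_a="engine", term_b="motor", k=2):
    """Similarity of two terms in LSI space without and with pseudo documents."""
    vocab = sorted({tok for doc in DOCS + PSEUDO_DOCS for tok in doc.split()})
    ia, ib = vocab.index(term_a), vocab.index(term_b)

    plain = lsi_term_vectors(term_doc_matrix(DOCS, vocab), k)
    augmented = lsi_term_vectors(term_doc_matrix(DOCS + PSEUDO_DOCS, vocab), k)

    return cosine(plain[ia], plain[ib]), cosine(augmented[ia], augmented[ib])


if __name__ == "__main__":
    before, after = compare_similarity()
    print(f"cosine(engine, motor) without pseudo documents: {before:.3f}")
    print(f"cosine(engine, motor) with pseudo documents:    {after:.3f}")
```

In this toy setting the two terms are essentially orthogonal in the rank-2 space before the pseudo document is added and become positively similar afterwards. How pseudo documents are actually selected and weighted in the paper is not stated in this record.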