DocumentCode :
2219717
Title :
Identifying variable-length meaningful phrases with correlation functions
Author :
Kim, Hyoung-Rae ; Chan, Philip K.
Author_Institution :
Dept. of Comput. Sci., Florida Inst. of Technol., Melbourne, FL, USA
fYear :
2004
fDate :
15-17 Nov. 2004
Firstpage :
30
Lastpage :
38
Abstract :
Finding meaningful phrases in a document has been studied in various information retrieval systems in order to improve their performance. Many previous statistical phrase-finding methods had a different aim, such as document classification. Some hybridize statistical and syntactic grammatical methods; others use correlation heuristics between words. We propose a new phrase-finding algorithm that adds correlated words one by one to the phrases found in the previous stage, maintaining high correlation within a phrase. Our results indicate that our algorithm finds more meaningful phrases than an existing algorithm. Furthermore, the previous algorithm could be improved by applying different correlation functions.
Keywords :
computational complexity; document handling; information retrieval; information retrieval systems; statistical analysis; correlation heuristics; information retrieval system; statistical method; syntactic grammatical method; time complexity; variable-length phrase-finding algorithm; Algorithm design and analysis; Artificial intelligence; Clustering algorithms; Data mining; Frequency; Humans; Information retrieval; Performance analysis; Probability; Robustness;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 2004. ICTAI 2004. 16th IEEE International Conference on
ISSN :
1082-3409
Print_ISBN :
0-7695-2236-X
Type :
conf
DOI :
10.1109/ICTAI.2004.70
Filename :
1374167