DocumentCode :
2594519
Title :
CASIT: Content Based Identification of Textual Information in a Large Database
Author :
Guezouli, Larbi ; Essafi, Hassane
Author_Institution :
Comput. Sci. Dept., Batna Univ., Batna, Algeria
fYear :
2010
fDate :
20-23 April 2010
Firstpage :
621
Lastpage :
625
Abstract :
This paper describes the CASIT model (CAlculation of SImilarity of Text). Starting from a coarse comparison of text documents based on the Latent Semantic Indexing (LSI) model, the CASIT method calculates, at a finer level, the degree of similarity between model text documents and the documents compared against them. Our approach takes into account the neighbourhood of the words, which makes it possible to weight the words in the calculation of the score.
Keywords :
text analysis; CASIT model; calculation of similarity of text; content based identification; latent semantic indexing model; text documents; textual information; Application software; Computer science; Conferences; Databases; Filters; Frequency; Indexing; Information retrieval; Large scale integration; Matrix decomposition; CASIT; Component; LSI; textual research; vectorial model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Information Networking and Applications Workshops (WAINA), 2010 IEEE 24th International Conference on
Conference_Location :
Perth, WA
Print_ISBN :
978-1-4244-6701-3
Type :
conf
DOI :
10.1109/WAINA.2010.133
Filename :
5480625