DocumentCode :
3776435
Title :
SAID: A new stemmer algorithm to indexing unstructured Document
Author :
Kabil Boukhari;Mohamed Nazih Omri
Author_Institution :
MARS Unit of Research, Department of computer sciences, Faculty of sciences of Monastir, University of Monastir, 5000, Tunisia
fYear :
2015
Firstpage :
59
Lastpage :
63
Abstract :
In this work, we propose a new stemming algorithm for indexing unstructured documents. It detects the most relevant words in an unstructured document. The algorithm is built on two main modules: the first processes compound words, and the second detects word endings that the approaches presented in the literature do not take into account. The proposed algorithm detects and removes suffixes, and it enriches the suffix base by eliminating the suffixes of compound words. We evaluated our algorithm on a standard term base, and the results show that it is remarkably effective compared with the other algorithms presented in related work.
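The abstract gives no rules or suffix lists, so the following is only a minimal sketch of the two-module idea it describes (compound-word handling, then suffix detection and removal). It is not the SAID algorithm itself: the suffix list, the minimum stem length, the hyphen-based compound splitting, and the function names `split_compound`, `strip_suffix`, and `stem` are all assumptions made for illustration.

```python
# Illustrative suffix list; an assumption, NOT the suffix base used by SAID.
SUFFIXES = ("ations", "ation", "ings", "ing", "ness", "ers", "er", "ed", "es", "s")
MIN_STEM_LEN = 3  # assumed lower bound on the length of a remaining stem


def split_compound(word):
    """Module 1 (assumed): split a hyphenated compound word into its parts."""
    return [part for part in word.split("-") if part]


def strip_suffix(word, suffixes=SUFFIXES):
    """Module 2 (assumed): remove the longest matching suffix.

    The suffix is only removed if the remaining stem keeps at least
    MIN_STEM_LEN characters; otherwise the word is returned unchanged.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word


def stem(word):
    """Lowercase the word, split compounds, then stem each part."""
    return [strip_suffix(part) for part in split_compound(word.lower())]
```

Under these assumptions, `stem("word-endings")` splits the compound first and then strips the suffix of each part, so the suffix of a compound component is handled the same way as that of a simple word. A real indexing pipeline would also need stop-word filtering and a far richer suffix base.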
Keywords :
"Context","Standards","Knowledge based systems","Biological system modeling","Semantics","Indexing"
Publisher :
IEEE
Conference_Titel :
Intelligent Systems Design and Applications (ISDA), 2015 15th International Conference on
Electronic_ISBN :
2164-7151
Type :
conf
DOI :
10.1109/ISDA.2015.7489180
Filename :
7489180