Title :
A semantic-based text classification system
Author :
Bawakid, Abdullah ; Oussalah, Mourad
Author_Institution :
Dept. of Electron., Electr. & Comput. Eng., Univ. of Birmingham, Birmingham, UK
Abstract :
This paper presents a system that performs automatic semantic-based text categorization. Using Princeton WordNet, a series of induced methods was implemented to extract semantic features from text and use them to decide how similar a document is to different topics. In addition, a bag-of-words method that incorporates no knowledge from WordNet is implemented in the system as a baseline against which the WordNet-based approaches are compared. The paper describes the system and reports on a simple analysis performed to evaluate the implemented methods. It closes with a discussion of the limitations of the study and of future work to optimize the system.
Keywords :
optimisation; pattern classification; text analysis; Princeton WordNet; automatic semantic based text categorization; bag-of-words method; optimization; semantic based text classification system; Accuracy; Encyclopedias; Integrated circuits; Internet; Semantics; Text categorization; Thesauri; Categorization; Information Retrieval; Semantic Similarity; Text Classification; Word Sense Disambiguation; WordNet
Conference_Title :
Cybernetic Intelligent Systems (CIS), 2010 IEEE 9th International Conference on
Conference_Location :
Reading
Print_ISBN :
978-1-4244-9023-3
Electronic_ISBN :
978-1-4244-9024-0
DOI :
10.1109/UKRICIS.2010.5898112