DocumentCode :
671702
Title :
Dealing with highly imbalanced textual data gathered into similar classes
Author :
Lamirel, Jean-Charles
Author_Institution :
Synalp Team, LORIA, Nancy, France
fYear :
2013
fDate :
4-9 Aug. 2013
Firstpage :
1
Lastpage :
7
Abstract :
This paper presents a new feature selection and feature contrasting approach for the classification of highly imbalanced textual data with a high degree of similarity between the associated classes. An example of such a classification context is the task of classifying bibliographic references into a patent classification scheme. This task is one of the domains of investigation of the QUAERO project, with the final goal of helping experts evaluate upcoming patents through the use of related research.
Keywords :
feature selection; learning (artificial intelligence); patents; pattern classification; text analysis; QUAERO project; bibliographic reference classification; degree of similarity; feature contrasting approach; feature selection; highly imbalanced textual data; patent classification scheme; Accuracy; Context; Feature extraction; Labeling; Measurement; Patents; Principal component analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks (IJCNN), The 2013 International Joint Conference on
Conference_Location :
Dallas, TX
ISSN :
2161-4393
Print_ISBN :
978-1-4673-6128-6
Type :
conf
DOI :
10.1109/IJCNN.2013.6707044
Filename :
6707044