Regional Information Center for Science and Technology - Dealing with highly imbalanced textual data gathered into similar classes

DocumentCode :

671702

Title :

Dealing with highly imbalanced textual data gathered into similar classes

Author :

Lamirel, Jean-Charles

Author_Institution :

Synalp Team, LORIA, Nancy, France

Year :

2013

Date :

4-9 Aug. 2013

Firstpage :

Lastpage :

Abstract :

This paper presents a new feature selection and feature contrasting approach for classifying highly imbalanced textual data in which the associated classes are highly similar to one another. An example of such a classification context is the task of classifying bibliographic references into a patent classification scheme. This task is one of the areas of investigation of the QUAERO project, with the final goal of helping experts evaluate upcoming patents through the use of related research.

Keywords :

feature selection; learning (artificial intelligence); patents; pattern classification; text analysis; QUAERO project; bibliographic reference classification; degree of similarity; feature contrasting approach; highly imbalanced textual data; patent classification scheme; Accuracy; Context; Feature extraction; Labeling; Measurement; Patents; Principal component analysis

Language :

English

Publisher :

IEEE

Conference_Title :

Neural Networks (IJCNN), The 2013 International Joint Conference on

Conference_Location :

Dallas, TX

ISSN :

2161-4393

Print_ISBN :

978-1-4673-6128-6

Type :

conf

DOI :

10.1109/IJCNN.2013.6707044

Filename :

6707044

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=671702