DocumentCode
3166613
Title
A Text Classification Framework with a Local Feature Ranking for Learning Social Networks
Author
Makrehchi, Masoud ; Kamel, Mohamed S.
Author_Institution
Univ. of Waterloo, Waterloo
fYear
2007
fDate
28-31 Oct. 2007
Firstpage
589
Lastpage
594
Abstract
In this paper, a text classification framework with a feature ranking scheme is proposed to extract social structures from text data. It is assumed that only a small subset of the relations between individuals in a community is known. Under this assumption, social network extraction is translated into a classification problem. The relation between two individuals is represented by merging their document vectors, and the given relations are used as labels for the training data. With this transformation, a text classifier such as Rocchio can be used to learn the unknown relations. We show that there is a link between the intrinsic sparsity of social networks and class imbalance. Furthermore, we show that feature ranking methods usually fail on problems with unbalanced data. To deal with this deficiency and re-balance the unbalanced social data, a local feature ranking method, called reverse discrimination, is proposed.
Keywords
classification; feature extraction; social sciences computing; text analysis; document vectors; learning social network; local feature ranking; reverse discrimination; social network extraction; social structures; text classification; Data mining; Frequency estimation; Machine learning; Pattern analysis; Search engines; Social network services; Text categorization; Training data; Vocabulary; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location
Omaha, NE
ISSN
1550-4786
Print_ISBN
978-0-7695-3018-5
Type
conf
DOI
10.1109/ICDM.2007.26
Filename
4470295