• DocumentCode
    3166613
  • Title
    A Text Classification Framework with a Local Feature Ranking for Learning Social Networks
  • Author
    Makrehchi, Masoud; Kamel, Mohamed S.
  • Author_Institution
    Univ. of Waterloo, Waterloo
  • fYear
    2007
  • fDate
    28-31 Oct. 2007
  • Firstpage
    589
  • Lastpage
    594
  • Abstract
    In this paper, a text classification framework with a feature ranking scheme is proposed to extract social structures from text data. It is assumed that only a small subset of the relations between individuals in a community is known. Under this assumption, social network extraction is recast as a classification problem. The relation between two individuals is represented by merging their document vectors, and the known relations are used as labels for the training data. With this transformation, a text classifier such as Rocchio can be used to learn the unknown relations. We show that there is a link between the intrinsic sparsity of social networks and class imbalance. Furthermore, we show that feature ranking methods usually fail on problems with unbalanced data. To address this deficiency and re-balance the unbalanced social data, a local feature ranking method, called reverse discrimination, is proposed.
  • Keywords
    classification; feature extraction; social sciences computing; text analysis; document vectors; learning social network; local feature ranking; reverse discrimination; social network extraction; social structures; text classification; Data mining; Frequency estimation; Machine learning; Pattern analysis; Search engines; Social network services; Text categorization; Training data; Vocabulary; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
  • Conference_Location
    Omaha, NE
  • ISSN
    1550-4786
  • Print_ISBN
    978-0-7695-3018-5
  • Type
    conf
  • DOI
    10.1109/ICDM.2007.26
  • Filename
    4470295
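
The transformation described in the abstract (pairs of individuals become merged document vectors, known relations become class labels, and a Rocchio-style classifier predicts the unknown relations) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how vectors are merged, so summing term counts is an assumption, and the toy documents, the names, and the helpers `doc_vector`, `merge_vectors`, `pair_vector`, `train_rocchio`, and `predict` are all hypothetical. The reverse discrimination ranking is not shown because the abstract gives no details of it.

```python
from collections import Counter
from math import sqrt


def doc_vector(text):
    """Bag-of-words term-frequency vector for one individual's document."""
    return Counter(text.lower().split())


def merge_vectors(a, b):
    """Represent a pair of individuals (assumption: sum their term counts)."""
    return a + b


def normalize(vec):
    """Scale a sparse vector to unit Euclidean length."""
    norm = sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    a, b = normalize(a), normalize(b)
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def train_rocchio(vectors, labels):
    """Rocchio-style training: one centroid of unit vectors per class."""
    grouped = {}
    for vec, label in zip(vectors, labels):
        grouped.setdefault(label, []).append(normalize(vec))
    centroids = {}
    for label, members in grouped.items():
        total = Counter()
        for member in members:
            for term, weight in member.items():
                total[term] += weight
        centroids[label] = {t: w / len(members) for t, w in total.items()}
    return centroids


def predict(centroids, vec):
    """Assign the class whose centroid is most similar to the vector."""
    return max(centroids, key=lambda label: cosine(centroids[label], vec))


# Hypothetical toy data: one short document per individual.
docs = {
    "alice": doc_vector("graph network mining"),
    "bob": doc_vector("network graph community"),
    "carol": doc_vector("protein biology folding"),
    "dave": doc_vector("biology protein structure"),
    "erin": doc_vector("graph community network"),
}


def pair_vector(p, q):
    """Merged document vector for the pair (p, q)."""
    return merge_vectors(docs[p], docs[q])


# Known relations: 1 = linked, 0 = not linked. Most pairs are unlinked,
# which is the sparsity / class-imbalance issue the abstract points out.
known = {("alice", "bob"): 1, ("alice", "carol"): 0, ("bob", "dave"): 0}

centroids = train_rocchio(
    [pair_vector(p, q) for p, q in known], list(known.values())
)
prediction = predict(centroids, pair_vector("bob", "erin"))
```

Here `prediction` holds the predicted label for the unlabeled pair (bob, erin). With realistic data the unlinked class would dominate the training set, which is the setting in which the abstract argues that ordinary feature ranking breaks down.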