DocumentCode :
3166370
Title :
Semi-supervised Document Clustering via Active Learning with Pairwise Constraints
Author :
Huang, Ruizhang ; Lam, Wai
Author_Institution :
Chinese Univ. of Hong Kong, Shatin
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
517
Lastpage :
522
Abstract :
This paper investigates a framework that discovers pairwise constraints for semi-supervised text document clustering. An active learning approach is proposed to select informative document pairs for obtaining user feedback. A gain-directed document pair selection method is designed that measures how much can be learned by revealing the relationship between a pair of documents. Three models are proposed: an uncertainty model, a generation error model, and an objective function model. Language modeling is investigated for representing clusters in the semi-supervised document clustering approach.
Keywords :
learning (artificial intelligence); pattern clustering; text analysis; active learning; gain directed document pair selection method; generation error model; informative document pairs; language modeling; objective function model; pairwise constraints; semi supervised text document clustering; uncertainty model; user feedback; Data engineering; Data mining; Feedback; Gain measurement; Machine learning; Parametric statistics; Probability distribution; Research and development management; Systems engineering and theory; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.79
Filename :
4470283