DocumentCode :
1174308
Title :
Tri-training: exploiting unlabeled data using three classifiers
Author :
Zhou, Zhi-Hua ; Li, Ming
Author_Institution :
Novel Software Technol., Nanjing Univ., China
Volume :
17
Issue :
11
Year :
2005
Firstpage :
1529
Lastpage :
1541
Abstract :
In many practical data mining applications, such as Web page classification, unlabeled training examples are readily available, but labeled ones are fairly expensive to obtain. Therefore, semi-supervised learning algorithms such as co-training have attracted much attention. In this paper, a new co-training style semi-supervised learning algorithm, named tri-training, is proposed. This algorithm generates three classifiers from the original labeled example set. These classifiers are then refined using unlabeled examples in the tri-training process. In detail, in each round of tri-training, an unlabeled example is labeled for a classifier if the other two classifiers agree on the labeling, under certain conditions. Since tri-training neither requires the instance space to be described with sufficient and redundant views nor does it put any constraints on the supervised learning algorithm, its applicability is broader than that of previous co-training style algorithms. Experiments on UCI data sets and application to the Web page classification task indicate that tri-training can effectively exploit unlabeled data to enhance the learning performance.
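The round structure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The bootstrap initialisation, the fixed number of rounds, the `NearestCentroid` stand-in learner, and the names `tri_train` and `predict` are all assumptions made for the example. The "certain conditions" under which an agreed label is accepted are not spelled out in the abstract, so they are omitted here.

```python
import random
from collections import Counter


class NearestCentroid:
    """Tiny stand-in base learner (assumption: the abstract fixes no learner)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
        # One centroid per class: the mean of that class's training points.
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=sq_dist)


def tri_train(X_lab, y_lab, X_unlab, rounds=3, seed=0):
    """Hypothetical helper: train three classifiers that label data for each other."""
    rng = random.Random(seed)
    n = len(X_lab)
    models = []
    # Assumption: each initial classifier is trained on a bootstrap sample
    # of the labeled set so that the three start out different.
    for _ in range(3):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(
            NearestCentroid().fit([X_lab[i] for i in idx], [y_lab[i] for i in idx])
        )
    for _ in range(rounds):
        refined = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            extra_X, extra_y = [], []
            for x in X_unlab:
                label_j = models[j].predict_one(x)
                label_k = models[k].predict_one(x)
                # An unlabeled example is labeled for classifier i only when
                # the other two classifiers agree on its label.
                if label_j == label_k:
                    extra_X.append(x)
                    extra_y.append(label_j)
            # Retrain classifier i on the labeled set plus this round's
            # agreed examples (acceptance conditions omitted in this sketch).
            refined.append(
                NearestCentroid().fit(list(X_lab) + extra_X, list(y_lab) + extra_y)
            )
        models = refined
    return models


def predict(models, x):
    """Final prediction by majority vote of the three classifiers."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]
```

Calling `tri_train` with a small labeled set and a pool of unlabeled points returns the three refined classifiers, and `predict` combines them by majority vote. Any learner exposing `fit` and `predict_one` could replace `NearestCentroid`, which reflects the abstract's point that the method places no constraint on the supervised learning algorithm.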
Keywords :
Internet; data mining; learning (artificial intelligence); pattern classification; Web page classification; co-training style algorithm; machine learning; semisupervised learning algorithm; tri-training process; unlabeled data; Humans; Labeling; Machine learning algorithms; Parameter estimation; Partitioning algorithms; Semisupervised learning; Supervised learning; Web pages; co-training; learning from unlabeled data; semi-supervised learning; tri-training
Language :
English
Journal_Title :
IEEE Transactions on Knowledge and Data Engineering
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2005.186
Filename :
1512038