Regional Information Center for Science and Technology (RICeST) - Using CoTraining and Semantic Feature Extraction for Positive and Unlabeled Text Classification

DocumentCode :

2332294

Title :

Using CoTraining and Semantic Feature Extraction for Positive and Unlabeled Text Classification

Author :

Luo, Na ; Yuan, Fuyu ; Zuo, Wanli

Author_Institution :

Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun

fYear :

2008

fDate :

20 Nov. 2008

Firstpage :

218

Lastpage :

221

Abstract :

This paper proposes an original three-step algorithm. First, CoTraining is employed to filter the likely positive data out of the unlabeled dataset U. Second, vectors for the documents in the positive set are obtained using semantic-based feature extraction, and the strong positives are then identified within the likely positive set produced in the first step. The data picked out are added to the positive dataset P. Finally, a linear one-class SVM learns from both the purified U as negative and the expanded P as positive. Because the algorithm automatically expands the positive dataset, it performs especially well when the given positive dataset P is insufficient. A comprehensive experiment showed that the algorithm is preferable to the existing ones.
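
The data flow of the three steps can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: step 1 uses a plain bag-of-words similarity threshold as a placeholder for CoTraining, the `CONCEPTS` map is a hypothetical toy stand-in for a WordNet lookup, step 3 uses a nearest-centroid rule in place of the SVM, and the thresholds, function names, and toy documents are invented for the example. Documents that are likely but not strongly positive are simply dropped here, which is also an assumption.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy synonym map standing in for a WordNet lookup.
CONCEPTS = {"car": "vehicle", "auto": "vehicle", "truck": "vehicle",
            "engine": "motor", "motor": "motor"}

def bow(tokens):
    """Plain bag-of-words vector."""
    return Counter(tokens)

def semantic(tokens):
    """Bag-of-concepts vector: synonyms collapse onto one feature."""
    return Counter(CONCEPTS.get(t, t) for t in tokens)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Sum of the vectors; cosine ignores the missing 1/n scaling."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total

def expand_positives(P, U, likely_t=0.2, strong_t=0.8):
    # Step 1 (placeholder for CoTraining): pull likely positives out of U.
    cp = centroid(bow(d) for d in P)
    likely = [d for d in U if cosine(bow(d), cp) >= likely_t]
    purified_U = [d for d in U if d not in likely]
    # Step 2: keep only the strong positives, judged on semantic features.
    cs = centroid(semantic(d) for d in P)
    strong = [d for d in likely if cosine(semantic(d), cs) >= strong_t]
    return P + strong, purified_U

def train(positives, negatives):
    # Step 3 (placeholder for the SVM): nearest-centroid decision rule.
    cpos = centroid(semantic(d) for d in positives)
    cneg = centroid(semantic(d) for d in negatives)
    def classify(doc):
        v = semantic(doc)
        return cosine(v, cpos) > cosine(v, cneg)
    return classify

# Toy data: one labeled positive document and four unlabeled ones.
P = [["car", "engine", "road"]]
U = [["auto", "motor", "road"], ["car", "engine", "fast"],
     ["cake", "sugar", "flour"], ["recipe", "sugar", "oven"]]

expanded_P, purified_U = expand_positives(P, U)
classify = train(expanded_P, purified_U)
```

In this toy run, the document `["auto", "motor", "road"]` shares only one surface word with the positive example, yet it matches it exactly once synonyms are mapped to shared concepts, which is the kind of case semantic feature extraction is meant to catch when P is small.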

Keywords :

feature extraction; pattern classification; support vector machines; text analysis; CoTraining; linear one-class SVM; positive data filtering; positive text classification; semantic feature extraction; unlabeled text classification; Employment; Feature extraction; Filtering algorithms; Information management; Information technology; Seminars; Supervised learning; Support vector machines; Technology management; Text categorization; CoTraining; SVM; Semantic Feature Extraction; WordNet;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Future Information Technology and Management Engineering, 2008. FITME '08. International Seminar on

Conference_Location :

Leicestershire, United Kingdom

Print_ISBN :

978-0-7695-3480-0

Type :

conf

DOI :

10.1109/FITME.2008.81

Filename :

4746478

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2332294