Regional Information Center for Science and Technology - Active Learning using Localized Generalization Error for Text Categorization

DocumentCode :

2894404

Title :

Active Learning using Localized Generalization Error for Text Categorization

Author :

Yeung, Daniel S. ; Zhang, Ying ; Ng, Wing W Y ; Chen, Qing-cai

Author_Institution :

Media & Life Sci. Comput. Lab., Harbin Inst. of Technol., Shenzhen

fYear :

2006

fDate :

13-16 Aug. 2006

Firstpage :

2686

Lastpage :

2691

Abstract :

Text categorization is an important step in many applications, e.g. Web page classification, search engine indexing, and information retrieval. When the number of available documents is huge, active learning can help reduce training time and cost. Moreover, active learning is able to filter out noisy training samples and may therefore achieve better generalization capability. In this work, we apply the localized generalization error model to active learning for text categorization. In our approach, the sample yielding the highest generalization error for the unseen samples local to it is selected as the next training sample. Feature extraction from raw documents is also discussed. Experimental results show that the proposed method is effective in reducing the number of training samples and achieves good generalization capability.
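
The selection rule described in the abstract (pick the candidate whose local neighbourhood has the highest estimated error, label it, retrain) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not give the localized generalization error bound itself, so the score here is a stand-in, a Monte Carlo estimate of how much the classifier output changes when a sample is perturbed within a neighbourhood of half-width `q`. The names `sensitivity_score`, `select_next`, `active_learning_loop`, `fit`, `oracle`, and `q` are all hypothetical.

```python
import random


def sensitivity_score(predict, x, q=0.1, n_draws=50, rng=None):
    """Stand-in local-error score (an assumption, not the paper's bound).

    Estimates the mean squared change in predict() when x is perturbed
    uniformly within a box of half-width q around it.
    """
    rng = rng or random.Random(0)
    base = predict(x)
    total = 0.0
    for _ in range(n_draws):
        perturbed = [xi + rng.uniform(-q, q) for xi in x]
        total += (predict(perturbed) - base) ** 2
    return total / n_draws


def select_next(predict, pool, q=0.1, rng=None):
    """Return the index of the pool sample with the highest score."""
    rng = rng or random.Random(0)
    scores = [sensitivity_score(predict, x, q=q, rng=rng) for x in pool]
    return max(range(len(pool)), key=scores.__getitem__)


def active_learning_loop(fit, oracle, labeled, pool, budget, q=0.1):
    """Pool-based loop: retrain, pick the highest-scoring sample, label it.

    fit(labeled) must return a predict(x) -> float function and
    oracle(x) must return the true label; both are hypothetical callables.
    """
    labeled, pool = list(labeled), list(pool)
    rng = random.Random(0)
    for _ in range(min(budget, len(pool))):
        predict = fit(labeled)
        i = select_next(predict, pool, q=q, rng=rng)
        x = pool.pop(i)
        labeled.append((x, oracle(x)))
    return labeled
```

In this sketch each document is assumed to be already converted to a numeric feature vector (a list of floats); the feature extraction step mentioned in the abstract is not modelled.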

Keywords :

error statistics; feature extraction; learning (artificial intelligence); natural languages; text analysis; active learning; localized generalization error bound; text categorization; Cybernetics; Decision trees; Indexing; Information retrieval; Internet; Learning systems; Machine learning; Machine learning algorithms; Search engines; Support vector machine classification; Support vector machines

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Machine Learning and Cybernetics, 2006 International Conference on

Conference_Location :

Dalian, China

Print_ISBN :

1-4244-0061-9

Type :

conf

DOI :

10.1109/ICMLC.2006.258926

Filename :

4028517

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2894404