Title :
Large Scale Sequential Learning from Partially Labeled Data
Author :
Jianqiang Li ; Chunchen Liu ; Bo Liu
Author_Institution :
NEC Labs. China, Beijing, China
Abstract :
The success of data-driven solutions to difficult problems, along with the dropping costs of storing and processing massive amounts of data, has led to growing interest in large-scale machine learning. In many cases, statistical learning problems involve sequential data that exhibit significant sequential correlation. This correlation makes training sequence classifiers time-consuming and makes sequential learning from large-scale data difficult, especially when the available training data are sparsely labeled. This paper proposes a novel learning approach for building sequence classifiers from a large amount of partially labeled training data. A semi-supervised learning mechanism for building classifiers from partially labeled data is embedded in an ensemble learning framework, which is adapted for distributed learning over large-scale datasets. To evaluate the approach in practice, we conducted experiments using a Conditional Random Field (CRF) as the base learner to detect concepts in a large-scale document set. The results show that our approach significantly outperforms the best baselines, demonstrating its effectiveness.
Keywords :
document handling; learning (artificial intelligence); CRF; computing framework; conditional random field; data driven solutions; data processing; distributed learning; dropping costs; ensemble learning; large scale document set; large scale machine learning; large scale sequential learning; partially labeled training data; semisupervised learning; sequence classifiers; sequential correlation; sequential data; statistical learning problems; Buildings; Electronic publishing; Encyclopedias; Internet; Training; Training data; Conditional random fields; Sequential learning; co-training; concept detection; ensemble learning; self-training;
Conference_Title :
Semantic Computing (ICSC), 2013 IEEE Seventh International Conference on
Conference_Location :
Irvine, CA
DOI :
10.1109/ICSC.2013.39