DocumentCode :
3166350
Title :
Training Conditional Random Fields by Periodic Step Size Adaptation for Large-Scale Text Mining
Author :
Huang, Han-Shen ; Chang, Yu-Ming ; Hsu, Chun-Nan
Author_Institution :
Acad. Sinica, Taipei
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
511
Lastpage :
516
Abstract :
For applications with consecutive incoming training examples, on-line learning has the potential to achieve a likelihood as high as off-line learning without scanning all available training examples, and it usually has a much smaller memory footprint. To train CRFs on-line, this paper presents the Periodic Step Size Adaptation (PSA) method, which dynamically adjusts the learning rates in stochastic gradient descent. We applied our method to three large-scale text mining tasks. Experimental results show that PSA outperforms the best off-line algorithm, L-BFGS, by many hundreds of times, and outperforms the best on-line algorithm, SMD, by an order of magnitude in terms of the number of passes required to scan the training data set.
Keywords :
data mining; gradient methods; learning (artificial intelligence); pattern classification; probability; random processes; stochastic processes; text analysis; conditional probability; conditional random field training; large-scale text mining; machine learning; periodic step size adaptation; sequential data classification; stochastic gradient descent; Algorithm design and analysis; Data mining; Geographic Information Systems; Information science; Iterative algorithms; Labeling; Large-scale systems; Stochastic processes; Text mining; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.39
Filename :
4470282