DocumentCode :
3439360
Title :
Pattern-Based Topic Models for Information Filtering
Author :
Gao, Yuan ; Xu, Yan ; Li, Yuhua
Author_Institution :
Fac. of Sci. & Eng., QUT, Brisbane, QLD, Australia
fYear :
2013
fDate :
7-10 Dec. 2013
Firstpage :
921
Lastpage :
928
Abstract :
Topic modelling techniques such as Latent Dirichlet Allocation (LDA) were proposed to generate statistical models that represent multiple topics in a collection of documents, and they have been widely used in machine learning and information retrieval. However, their effectiveness in information filtering is largely unknown. Patterns are generally thought to be more representative than single terms for representing documents. In this paper, a novel information filtering model, the Pattern-based Topic Model (PBTM), is proposed. It represents text documents using both topic distributions at a general level and semantic pattern representations at a detailed, specific level, both of which contribute to accurate document representation and document relevance ranking. Extensive experiments are conducted to evaluate the effectiveness of PBTM using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model achieves outstanding performance.
Keywords :
information filtering; pattern classification; text analysis; LDA; PBTM; TREC data collection Reuters Corpus Volume 1; document relevance ranking; document representation; information filtering model; latent Dirichlet allocation; pattern-based topic models; semantic pattern representations; statistical models; text documents; topic distributions; Data mining; Data models; Itemsets; Mathematical model; Semantics; Taxonomy; Training; Topic models; closed pattern; information filtering; pattern mining; user modelling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4799-3143-9
Type :
conf
DOI :
10.1109/ICDMW.2013.30
Filename :
6754020