DocumentCode
3439360
Title
Pattern-Based Topic Models for Information Filtering
Author
Gao, Yuan ; Xu, Yan ; Li, Yuhua
Author_Institution
Fac. of Sci. & Eng., QUT, Brisbane, QLD, Australia
fYear
2013
fDate
7-10 Dec. 2013
Firstpage
921
Lastpage
928
Abstract
Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical models that represent multiple topics in a collection of documents, and it has been widely used in fields such as machine learning and information retrieval. Its effectiveness in information filtering, however, is little known. Patterns are generally thought to be more representative than single terms for representing documents. In this paper, a novel information filtering model, the Pattern-based Topic Model (PBTM), is proposed. It represents text documents using not only topic distributions at a general level but also semantic pattern representations at a detailed, specific level, both of which contribute to accurate document representation and document relevance ranking. Extensive experiments are conducted to evaluate the effectiveness of PBTM on the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model achieves outstanding performance.
Keywords
information filtering; pattern classification; text analysis; LDA; PBTM; TREC data collection Reuters Corpus Volume 1; document relevance ranking; document representation; information filtering model; latent Dirichlet allocation; pattern-based topic models; semantic pattern representations; statistical models; text documents; topic distributions; Data mining; Data models; Itemsets; Mathematical model; Semantics; Taxonomy; Training; Topic models; closed pattern; information filtering; pattern mining; user modelling;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
Conference_Location
Dallas, TX
Print_ISBN
978-1-4799-3143-9
Type
conf
DOI
10.1109/ICDMW.2013.30
Filename
6754020