DocumentCode :
2118398
Title :
Using Patterns Co-occurrence Matrix for Cleaning Closed Sequential Patterns for Text Mining
Author :
Albathan, Mubarak ; Li, Yuefeng ; Algarni, Abdulmohsen
Author_Institution :
Sci. & Eng. Fac., Queensland Univ. of Technol., Brisbane, QLD, Australia
Volume :
1
fYear :
2012
fDate :
4-7 Dec. 2012
Firstpage :
201
Lastpage :
205
Abstract :
With the overwhelming increase in the amount of text on the web, it is almost impossible for people to keep abreast of up-to-date information. Text mining is a process by which interesting information is derived from text through the discovery of patterns and trends. Text mining algorithms are used to guarantee the quality of extracted knowledge. However, patterns extracted with text or data mining algorithms are often noisy and inconsistent. Several challenges therefore arise: how to interpret these patterns, whether the model used is suitable, and whether all the extracted patterns are relevant. A further question is how to assign a correct weight to the extracted knowledge. To address these issues, this paper presents a text post-processing method that uses a pattern co-occurrence matrix to find the relations between extracted patterns in order to reduce noisy patterns. The main objective of this paper is not only to reduce the number of closed sequential patterns but also to improve the performance of pattern mining. Experimental results on the Reuters Corpus Volume 1 data collection and TREC filtering topics show that the proposed method is promising.
Keywords :
Internet; data mining; information filtering; matrix algebra; text analysis; Reuters Corpus Volume 1 data collection; TREC filtering topics; Web; closed sequential pattern cleaning; data mining algorithms; knowledge extraction; pattern co-occurrence matrix; pattern discovery; pattern mining; text mining; text post-processing method; trend discovery; Closed Sequential pattern; Information retrieval; Pattern co-occurrence matrix; Text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on
Conference_Location :
Macau
Print_ISBN :
978-1-4673-6057-9
Type :
conf
DOI :
10.1109/WI-IAT.2012.131
Filename :
6511885