DocumentCode
553162
Title
A novel approach to compute pattern history for trend analysis
Author
Jing-Doo Wang
Author_Institution
Dept. of Comput. Sci. & Inf. Eng., Asia Univ., Taichung, Taiwan
Volume
3
fYear
2011
fDate
26-28 July 2011
Firstpage
1746
Lastpage
1750
Abstract
Observing the history of a pattern in a retrospective corpus lets one sense the trends related to that pattern efficiently, where a pattern history is defined as the frequency distribution of that pattern over time. Pattern histories can provide information analysts with valuable information and clues for trend analysis. In this study, a pattern can be a single token or a sequence of words. To extract significant patterns from a large amount of text while computing the corresponding pattern histories, a scalable external-memory approach based on bucket-like suffix sorting and push-pop stack operations is proposed. To demonstrate the scalability and robustness of this approach, the experimental data consisted of 3,225,549 articles (about 4 GB) downloaded from PubMed, covering the 20 years from 1990 to 2009; the total computation time for the pattern histories was about 48 hours on a single PC. Experimental results showed that specific pattern histories did reveal variations in some events and gave hints for trend analysis.
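To make the definition of a pattern history concrete, the following is a minimal in-memory sketch that counts how often each word sequence occurs per year. It is not the external-memory, bucket-like suffix sorting method the abstract describes; the function name `pattern_histories`, the `max_len` parameter, and the toy corpus are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def pattern_histories(docs, max_len=2):
    """Map each pattern (a tuple of 1..max_len consecutive words)
    to a Counter of year -> number of occurrences in that year.

    Naive illustration only: everything is held in memory, so it
    would not scale to a corpus of millions of articles.
    """
    histories = defaultdict(Counter)
    for year, text in docs:
        words = text.lower().split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                histories[tuple(words[i:i + n])][year] += 1
    return histories


# Hypothetical toy corpus of (year, text) pairs, not data from the paper.
docs = [
    (1990, "lung cancer screening"),
    (1991, "lung cancer therapy and lung cancer risk"),
    (1991, "gene therapy"),
]

histories = pattern_histories(docs)
lung_cancer = dict(histories[("lung", "cancer")])
therapy_1991 = histories[("therapy",)][1991]
```

Here `lung_cancer` holds the per-year counts for the two-word pattern "lung cancer", which is the kind of frequency distribution over time that the abstract calls a pattern history.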
Keywords
pattern classification; sorting; text analysis; word processing; bucket-like suffix sorting; frequency distribution; information analyst; pattern extraction; pattern history; push-pop stack operation; retrospective corpus; scalability; trend analysis; word sequence; Bioinformatics; Cancer; History; Lungs; Sorting; Time frequency analysis; USA Councils
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-61284-180-9
Type
conf
DOI
10.1109/FSKD.2011.6019799
Filename
6019799