DocumentCode :
553162
Title :
A novel approach to compute pattern history for trend analysis
Author :
Jing-Doo Wang
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Asia Univ., Taichung, Taiwan
Volume :
3
fYear :
2011
fDate :
26-28 July 2011
Firstpage :
1746
Lastpage :
1750
Abstract :
It is attractive to observe the history of a pattern in a retrospective corpus so that one can efficiently sense the trends related to that pattern, where a pattern history is defined as the frequency distribution of that pattern over time. Pattern histories can provide information analysts with valuable information and clues for trend analysis. In this study, a pattern can be a token or a sequence of words. To extract significant patterns from a large amount of text while computing the corresponding pattern histories, a scalable external-memory approach based on bucket-like suffix sorting and push-pop stack operations is proposed. To highlight the scalability and robustness of this approach, the experimental data consisted of 3,225,549 articles (about 4 GB) downloaded from PubMed, covering the 20 years from 1990 to 2009; the total computation time for the pattern histories was about 48 hours on a single PC. Experimental results showed that specific pattern histories did reveal the variations of some events and gave hints for trend analysis.
Keywords :
pattern classification; sorting; text analysis; word processing; bucket-like suffix sorting; frequency distribution; information analyst; pattern extraction; pattern history; push-pop stack operation; retrospective corpus; scalability; text analysis; trend analysis; word sequence; Bioinformatics; Cancer; History; Lungs; Sorting; Time frequency analysis; USA Councils;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
Type :
conf
DOI :
10.1109/FSKD.2011.6019799
Filename :
6019799