DocumentCode :
2643109
Title :
Unsupervised multilingual concept discovery from daily online news extracts
Author :
Wang, Jenq-Haur
Author_Institution :
Nat. Taipei Univ. of Technol., Taipei, Taiwan
fYear :
2010
fDate :
23-26 May 2010
Firstpage :
132
Lastpage :
134
Abstract :
Web syndication technologies make it easy to aggregate daily news from diverse sources. However, the sheer volume of information makes it harder to read the news, let alone digest it and focus on the most important events. Therefore, we need an efficient method for news extraction and mining. In this paper, we propose an unsupervised approach to multilingual concept discovery from daily online news extracts. First, key terms are extracted statistically from short news extracts. Second, similar term candidates are grouped into concrete concepts with unsupervised term clustering methods. Our goal is automatic news processing with minimal resources and no advance training. The experimental results show the potential of the proposed approach in terms of efficiency and effectiveness. Further investigation is needed to study the cross-lingual relations between extracted concepts.
Keywords :
Aggregates; Cellular neural networks; Clustering methods; Concrete; Data mining; Feeds; Information security; Text categorization; Text mining; Training data; Term extraction; news summarization; term clustering; text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligence and Security Informatics (ISI), 2010 IEEE International Conference on
Conference_Location :
Vancouver, BC, Canada
Print_ISBN :
978-1-4244-6444-9
Type :
conf
DOI :
10.1109/ISI.2010.5484763
Filename :
5484763