DocumentCode :
1636500
Title :
Use of NER Information for Improved Topic Tracking
Author :
Xiaowei, Wang ; Jiang Longbin ; Ma Jialin ; Jiangyan
Author_Institution :
Software Coll., Shenyang Normal Univ., Shenyang
Volume :
3
fYear :
2008
Firstpage :
165
Lastpage :
170
Abstract :
The aim of topic tracking is to monitor a stream of news stories and find additional stories on a topic that was identified from several sample stories. We propose a method, which we call multi-vector, that uses NER information to improve topic tracking. We extract proper names, locations, and normal terms into distinct sub-vectors of the document representation. The similarity of two documents is measured by comparing two sub-vectors at a time. We use the TDT4 corpus as the test corpus and compare the tracking performance of a system based on the multi-vector model with that of a system based on the traditional vector space model. We also analyze how the number of features affects topic tracking performance. The experimental results show that the multi-vector model improves tracking performance.
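The sub-vector comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the per-sub-vector similarities are computed or combined, so the cosine measure, the weighted sum, and every name below (`SUBVECTORS`, `make_doc`, `multi_vector_similarity`, the `weights` values) are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter

# Hypothetical sub-vector labels; the abstract names proper names,
# locations, and normal terms as the three kinds of features.
SUBVECTORS = ("names", "locations", "terms")


def make_doc(names, locations, terms):
    """Build a multi-vector document: one term-frequency vector per feature kind."""
    return {
        "names": Counter(names),
        "locations": Counter(locations),
        "terms": Counter(terms),
    }


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty sub-vector contributes no similarity
    return dot / (norm_a * norm_b)


def multi_vector_similarity(doc_a, doc_b, weights):
    """Compare matching sub-vectors pairwise and combine them.

    Assumption: a weighted sum of per-sub-vector cosine scores. The source
    only states that two sub-vectors are compared at a time.
    """
    return sum(weights[k] * cosine(doc_a[k], doc_b[k]) for k in SUBVECTORS)


# Illustrative weights only; they are not taken from the paper.
weights = {"names": 0.4, "locations": 0.3, "terms": 0.3}

story = make_doc(["Wang"], ["Shenyang"], ["election", "vote"])
other = make_doc(["Li"], ["Shenyang"], ["market", "trade"])
score = multi_vector_similarity(story, other, weights)
```

In this sketch the two stories share only a location, so only the location sub-vector contributes to `score`. Keeping the feature kinds in separate sub-vectors lets a tracker weight named entities differently from ordinary terms, which a single bag-of-words vector cannot do.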
Keywords :
classification; feature extraction; text analysis; vectors; NER feature extraction; document representation; multivector space model; news story; text classification; topic tracking; Application software; Data mining; Educational institutions; Event detection; Information retrieval; Intelligent systems; Isolation technology; Monitoring; System testing; Time measurement; NER information; TDT4 corpus; multi-vector model; topic tracking;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-3382-7
Type :
conf
DOI :
10.1109/ISDA.2008.136
Filename :
4696456