DocumentCode :
2127475
Title :
A Web entity activity recognition approach based on k-nearest neighbors classifier
Author :
Liu, Hui ; Zhang, Chuanyan
Author_Institution :
Sch. of Mech. Eng., Shandong Univ., Jinan, China
fYear :
2012
fDate :
21-23 April 2012
Firstpage :
848
Lastpage :
852
Abstract :
Building on traditional information extraction, this paper puts forward an approach to recognizing entity activity from the Web. Entity activity, which describes the behaviour or actions of an entity, is valuable for search engines, business intelligence, data integration, and similar applications. Because Web entity activities are multi-relationships contained in unstructured Web text, recognizing them is a difficult problem in current information extraction. We first rigorously define the recognition task and then propose a novel recognition framework for extracting entity activities from the Web. The framework first applies a k-nearest neighbors classifier based on sentence similarity to find the sentences that contain Web entity activity. It then uses a set of heuristic rules, based on dependency parsing features, to extract the information from those sentences. Experimental results demonstrate the feasibility and effectiveness of the proposed approach and show that it can adapt to multiple domains.
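The sentence-filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which sentence similarity measure the authors use, so this example assumes bag-of-words cosine similarity; the function names, the labels "activity" and "other", and the tiny training set are all hypothetical and not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity (an assumed stand-in measure)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(sentence, labeled, k=3):
    """Label a sentence by majority vote among its k most similar examples."""
    ranked = sorted(
        labeled,
        key=lambda pair: cosine_similarity(sentence, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled sentences, invented for illustration only.
TRAINING = [
    ("Acme Corp acquired Beta Ltd in March", "activity"),
    ("Acme Corp launched a new phone yesterday", "activity"),
    ("The company opened an office in Jinan", "activity"),
    ("Acme Corp is a large company", "other"),
    ("The phone is black and light", "other"),
    ("Jinan is a city in China", "other"),
]

print(knn_classify("Beta Ltd launched a new office in March", TRAINING))
print(knn_classify("Jinan is a large city", TRAINING))
```

In this sketch, sentences labeled "activity" would be passed on to the rule-based extraction stage, which is not shown here.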
Keywords :
Internet; grammars; information retrieval; pattern classification; text analysis; Web entity activity recognition; action information; business intelligence; data integration; dependency parsing feature; heuristic rule; information extraction; k-nearest neighbors classifier; recognition task; search engine; sentence similarity; unstructured text; Bayesian methods; Companies; Data mining; Educational institutions; Feature extraction; Text recognition; Web pages; web entity activity
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Consumer Electronics, Communications and Networks (CECNet), 2012 2nd International Conference on
Conference_Location :
Yichang
Print_ISBN :
978-1-4577-1414-6
Type :
conf
DOI :
10.1109/CECNet.2012.6202019
Filename :
6202019