• DocumentCode
    2127475
  • Title
    A Web entity activity recognition approach based on k-nearest neighbors classifier
  • Author
    Liu, Hui ; Zhang, Chuanyan
  • Author_Institution
    Sch. of Mech. Eng., Shandong Univ., Jinan, China
  • fYear
    2012
  • fDate
    21-23 April 2012
  • Firstpage
    848
  • Lastpage
    852
  • Abstract
    Building on traditional information extraction, this paper puts forward an approach to recognizing entity activities from the Web. An entity activity, which describes the behaviour or actions of an entity, is valuable to recognize for search engines, business intelligence, data integration and similar applications. Because Web entity activities, as multi-relationships, are contained in unstructured text on the Web, recognizing them is a difficult problem in current information extraction. In this paper, we first rigorously define the recognition task and then propose a novel recognition framework to extract entity activities from the Web. First, a k-nearest neighbors classifier based on sentence similarity discovers the sentences that contain Web entity activities. Then, using dependency parsing features, a set of heuristic rules extracts the activity information from those sentences. Experimental results demonstrate the feasibility and effectiveness of the proposed approach and show that it can adapt to multiple domains.
  • Keywords
    Internet; grammars; information retrieval; pattern classification; text analysis; Web entity activity recognition; action information; business intelligence; data integration; dependency parsing feature; heuristic rule; information extraction; k-nearest neighbors classifier; recognition task; search engine; sentence similarity; unstructured text; Bayesian methods; Companies; Data mining; Educational institutions; Feature extraction; Text recognition; Web pages; information extraction; k-nearest neighbors classifier; web entity activity;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Consumer Electronics, Communications and Networks (CECNet), 2012 2nd International Conference on
  • Conference_Location
    Yichang
  • Print_ISBN
    978-1-4577-1414-6
  • Type
    conf
  • DOI
    10.1109/CECNet.2012.6202019
  • Filename
    6202019