• DocumentCode
    2704291
  • Title
    Research and Realization of Text Mining Algorithm on Web
  • Author
    Yin, Shiqun ; Qiu, Yuhui ; Ge, Jike
  • Author_Institution
    Southwest Univ., Chongqing
  • fYear
    2007
  • fDate
    15-19 Dec. 2007
  • Firstpage
    413
  • Lastpage
    416
  • Abstract
    Text information on the Web is growing at an astounding pace, and the research and application of Web text mining is an important branch of data mining. Today people mainly use information retrieval (IR) or search engines to look up Web information. However, IR focuses on finding information that is explicitly present in a document rather than its latent knowledge, and search engines can hardly adapt to the differing needs of different users, provide individualized service, or support further mining of the data. Web text mining aims to resolve these problems. This paper discusses an algorithm that uses text mining techniques to follow a specified website or Web page according to the user's request, to extract and represent text features, and to classify the data using feedback judgement combined with the Web page text content for later use. We present experiments on different data sets that demonstrate that our algorithm is more effective than the traditional algorithm. The process of Web text mining, the information extraction method, the mining algorithm, and the realization technique are discussed in detail.
  • Keywords
    Internet; information retrieval; search engines; text analysis; Web page; Web text mining; feedback judgement; realization technique; search engine; Computational intelligence; Computer security; Data mining; Feedback; Information retrieval; Search engines; Text categorization; Text mining; Text recognition; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Security Workshops, 2007. CISW 2007. International Conference on
  • Conference_Location
    Harbin
  • Print_ISBN
    978-0-7695-3073-4
  • Type
    conf
  • DOI
    10.1109/CISW.2007.4425522
  • Filename
    4425522