• DocumentCode
    2247409
  • Title
    Web information processing and extracting
  • Author
    Gao, Kai ; Zong, Bao-qin ; Yang, Xiu-li
  • Author_Institution
    Dept. of Inf. Sci. & Eng., Hebei Univ. of Sci. & Technol., Shijiazhuang, China
  • Volume
    5
  • fYear
    2010
  • fDate
    11-14 July 2010
  • Firstpage
    2350
  • Lastpage
    2355
  • Abstract
    With the rapid growth of the web, search engines have become an important tool for retrieving relevant information from the Internet. Because of limited bandwidth, storage, and other constraints, a general search engine is not suitable for some situations. A topical search engine, which collects domain-specific content by focused crawling, is needed. It can provide higher accuracy than general search because the domain collection contains little irrelevant information, so web information processing and extraction are necessary. This paper presents some strategies for web information processing, together with analysis and extraction based on data content mining. The experimental results validate the suitability of the approach, and some remaining problems are presented at the end.
  • Keywords
    Internet; data mining; information retrieval; search engines; Web information extracting; Web information processing; data content mining; search engine; Accuracy; Data mining; Databases; Materials; Noise; Web pages; Crawling; Information extracting; Information processing; Topical search;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-1-4244-6526-2
  • Type
    conf
  • DOI
    10.1109/ICMLC.2010.5580664
  • Filename
    5580664