• DocumentCode
    3261958
  • Title

    A fast Chinese web-document clustering method under Pareto’s Principle

  • Author

    Tianlei, Zhang ; Guishen, Chen ; Hao, Che

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
  • fYear
    2008
  • fDate
    26-28 Aug. 2008
  • Firstpage
    801
  • Lastpage
    804
  • Abstract
    Nowadays most search engines, such as Google and Baidu, present their query results ranked by item value and list them across several pages. In an age of information explosion, the number of pages can be huge, and users have to glance over several before they find what they want. Clustering the results would solve this problem. Several clustering methods exist, but they are not very accurate or efficient, especially when the result sets consist of millions of items. This article describes a fast method under Pareto's Principle.
  • Keywords
    document handling; pattern clustering; search engines; Baidu; Chinese Web-document clustering method; Google; Pareto principle; information explosion; search engine; Artificial intelligence; Clustering algorithms; Clustering methods; Explosions; Internet; Machine learning algorithms; Search engines; Systems engineering and theory; Testing; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Granular Computing, 2008. GrC 2008. IEEE International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4244-2512-9
  • Electronic_ISBN
    978-1-4244-2513-6
  • Type

    conf

  • DOI
    10.1109/GRC.2008.4664707
  • Filename
    4664707