• DocumentCode
    2381451
  • Title

    A Novel Document Analysis Method Using Compressibility Vector

  • Author

    Zhang, Nuo ; Watanabe, Toshinori ; Matsuzaki, Daisuke ; Koga, Hisashi

  • Author_Institution
    Univ. of Electro-Commun., Tokyo
  • fYear
    2007
  • fDate
    1-3 Nov. 2007
  • Firstpage
    38
  • Lastpage
    40
  • Abstract
    Similarity analysis and keyword extraction are widely used as document relation analysis techniques. These methods are based on dictionary-based morphological analysis. However, they cannot keep up when the Internet grows rapidly and new words appear faster than dictionaries can be updated. In this study, we propose a new document relation analysis method based on the document's compressibility. The effectiveness of the proposed method is examined in simulations.
  • Keywords
    data compression; dictionaries; document handling; dictionary-base morphological analysis; document compressibility vector; document relation analysis method; keyword extraction; similarity analysis; Algorithm design and analysis; Data compression; Data mining; Data privacy; Dictionaries; Information analysis; Information systems; Internet; Text analysis; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data, Privacy, and E-Commerce, 2007. ISDPE 2007. The First International Symposium on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-0-7695-3016-1
  • Type

    conf

  • DOI
    10.1109/ISDPE.2007.93
  • Filename
    4402633