DocumentCode
2381451
Title
A Novel Document Analysis Method Using Compressibility Vector
Author
Zhang, Nuo ; Watanabe, Toshinori ; Matsuzaki, Daisuke ; Koga, Hisashi
Author_Institution
Univ. of Electro-Commun., Tokyo
fYear
2007
fDate
1-3 Nov. 2007
Firstpage
38
Lastpage
40
Abstract
Similarity analysis and keyword extraction are widely used document relation analysis techniques. These methods rely on dictionary-based morphological analysis. However, they fall short as the Internet grows rapidly and new words appear faster than dictionaries can be updated. In this study, we propose a new document relation analysis method based on the document's compressibility. The effectiveness of the proposed method is examined through simulations.
Keywords
data compression; dictionaries; document handling; dictionary-base morphological analysis; document compressibility vector; document relation analysis method; keyword extraction; similarity analysis; Algorithm design and analysis; Data compression; Data mining; Data privacy; Dictionaries; Information analysis; Information systems; Internet; Text analysis; Web pages;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data, Privacy, and E-Commerce, 2007. ISDPE 2007. The First International Symposium on
Conference_Location
Chengdu
Print_ISBN
978-0-7695-3016-1
Type
conf
DOI
10.1109/ISDPE.2007.93
Filename
4402633