Title : 
A Novel Document Analysis Method Using Compressibility Vector
         
        
            Author : 
Zhang, Nuo ; Watanabe, Toshinori ; Matsuzaki, Daisuke ; Koga, Hisashi
         
        
            Author_Institution : 
Univ. of Electro-Commun., Tokyo
         
        
            Abstract : 
Similarity analysis and keyword extraction are widely used techniques for analyzing relations between documents. These methods rely on dictionary-based morphological analysis. However, as the Internet grows rapidly and new words keep appearing, dictionaries cannot be updated fast enough, and such methods fall short. In this study, we propose a new document relation analysis method based on the document's compressibility. The effectiveness of the proposed method is examined in simulations.
         
        
            Keywords : 
data compression; dictionaries; document handling; dictionary-based morphological analysis; document compressibility vector; document relation analysis method; keyword extraction; similarity analysis; Algorithm design and analysis; Data compression; Data mining; Data privacy; Dictionaries; Information analysis; Information systems; Internet; Text analysis; Web pages;
         
        
            Conference_Title : 
Data, Privacy, and E-Commerce, 2007. ISDPE 2007. The First International Symposium on
         
        
            Conference_Location : 
Chengdu
         
        
            Print_ISBN : 
978-0-7695-3016-1
         
        
            DOI : 
10.1109/ISDPE.2007.93