Title : 
Detection of trends of technical phrases in text mining
         
        
            Author : 
Abe, Hidenao ; Tsumoto, Shusaku
         
        
            Author_Institution : 
Sch. of Med., Shimane Univ., Matsue, Japan
         
        
        
        
        
        
            Abstract : 
In text mining, importance indices of technical terms play a key role in finding valuable patterns in documents. Methods for finding emergent terms have also attracted considerable attention as part of what is called temporal text mining. However, many conventional methods are not robust against changes in technical terms. To robustly detect remarkable temporal trends of technical terms in given textual datasets, we propose a method based on temporal changes in several importance indices, treating the importance indices of the terms as a dataset. The method consists of automatic term extraction from the given documents, three importance indices from text mining studies, and temporal trend detection based on the results of linear regression analysis. In an empirical study, we applied the three importance indices to the titles of four annual conferences in the data mining field, treated as sets of documents. After detecting the temporal trends of the automatically extracted phrases, we compared the trends of the technical phrases among the titles of the annual conferences.
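The trend detection step described in the abstract can be illustrated with a minimal sketch. It assumes each term already has one importance index value per year; the abstract does not name the three indices or the decision rule, so the function names (`linear_trend`, `classify_trends`), the `min_slope` threshold, and the labels `rising`, `falling`, and `flat` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple


def linear_trend(values: List[float]) -> Tuple[float, float]:
    """Fit y = slope * t + intercept by ordinary least squares.

    The time axis is assumed to be t = 0, 1, 2, ... (one point per year).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two time points to fit a line")
    mean_t = (n - 1) / 2.0
    mean_y = sum(values) / n
    # Sum of squared deviations of t, and cross-deviations of t and y.
    s_tt = sum((t - mean_t) ** 2 for t in range(n))
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    return slope, intercept


def classify_trends(
    index_by_term: Dict[str, List[float]], min_slope: float = 0.0
) -> Dict[str, str]:
    """Label each term by the sign of its fitted slope.

    min_slope is a hypothetical tolerance: slopes whose magnitude does
    not exceed it are labelled "flat".
    """
    labels = {}
    for term, series in index_by_term.items():
        slope, _ = linear_trend(series)
        if slope > min_slope:
            labels[term] = "rising"
        elif slope < -min_slope:
            labels[term] = "falling"
        else:
            labels[term] = "flat"
    return labels


if __name__ == "__main__":
    # Made-up yearly index values, for illustration only.
    example = {
        "term a": [0.1, 0.2, 0.4, 0.5],
        "term b": [0.6, 0.4, 0.3, 0.1],
    }
    for term, label in classify_trends(example).items():
        print(term, label)
```

Running the same classification once per importance index, and once per conference, would allow the kind of comparison across indices and venues that the abstract mentions; how the paper actually combines the three indices is not stated here.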
         
        
            Keywords : 
data mining; regression analysis; text analysis; automatic term extraction method; linear regression analysis; technical phrases; text mining; trend detection; Automata; Frequency; Hidden Markov models; Humans; Linear regression; Pattern recognition; Robustness
         
        
        
        
            Conference_Title : 
Granular Computing, 2009, GRC '09. IEEE International Conference on
         
        
            Conference_Location : 
Nanchang
         
        
            Print_ISBN : 
978-1-4244-4830-2
         
        
        
            DOI : 
10.1109/GRC.2009.5255172