Title : 
A Semantic Kernel for Semi-structured Documents
         
        
            Author : 
Aseervatham, Sujeevan ; Viennet, Emmanuel ; Bennani, Younes
         
        
            Author_Institution : 
Inst. Galilee Univ. Paris 13, Villetaneuse
         
        
        
        
        
        
            Abstract : 
Natural Language Processing has emerged as an active field of research in the machine learning community. Several methods based on statistical information have been proposed. However, because of the linguistic complexity of texts, semantic-based approaches have also been investigated. In this paper, we propose a Semantic Kernel for semi-structured biomedical documents. The semantic meanings of words are extracted using the UMLS framework. The kernel, combined with an SVM classifier, has been applied to a text categorization task on a medical corpus of free-text documents. The results show that the Semantic Kernel outperforms the Linear Kernel and the Naive Bayes classifier. Moreover, this kernel ranked in the top ten among 44 classification methods at the 2007 CMC Medical NLP International Challenge.
         
        
            Keywords : 
learning (artificial intelligence); medical information systems; natural language processing; semantic networks; support vector machines; text analysis; 2007 CMC Medical NLP International Challenge; SVM classifier; UMLS framework; linguistic complexity; machine learning; medical corpus; natural language processing; semantic kernel; semantic-based approaches; semi-structured biomedical documents; statistical information; text categorization task; Data mining; Feature extraction; Humans; Kernel; Machine learning; Natural language processing; Support vector machines; Text categorization; Tree data structures; Unified modeling language;
         
        
        
        
            Conference_Title : 
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
         
        
            Conference_Location : 
Omaha, NE
         
        
        
            Print_ISBN : 
978-0-7695-3018-5
         
        
        
            DOI : 
10.1109/ICDM.2007.23