Title : 
Recognizing Biomedical Named Entities in the Absence of Human Annotated Corpora
         
        
            Author : 
Gu, Baohua ; Dahl, Veronica ; Popowich, Fred
         
        
            Author_Institution : 
Simon Fraser Univ., Burnaby
         
        
        
            fDate : 
Aug. 30 2007-Sept. 1 2007
         
        
        
        
            Abstract : 
Biomedical named entity recognition is an important task in biomedical text mining. Currently, the dominant approach is supervised learning, which requires a sufficiently large human-annotated corpus for training. In this paper, we propose a novel approach aimed at minimizing the annotation requirement. The idea is to use a dictionary, which is essentially a list of entity names compiled by domain experts and is sometimes more readily available than the experts themselves. Given an unlabelled training corpus, we label the sentences by a simple dictionary lookup, which provides us with highly reliable but incomplete positive data. We then run an SVM-based self-training process, in the spirit of semi-supervised learning, that iteratively learns from the positive and unlabelled data to build a reliable classifier. Our evaluation on the BioNLP-2004 shared task data sets suggests that the proposed method can be a feasible alternative to traditional approaches when human annotation is not available.
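
The dictionary-lookup labelling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `label_by_dictionary`, the `POS`/`UNL` tags, the longest-match-first rule, the case-insensitive comparison, and the toy dictionary are all assumptions made for illustration. Tokens covered by a dictionary entry become positive examples; everything else stays unlabelled rather than negative, since the dictionary is incomplete. The SVM-based self-training step is not sketched here.

```python
from typing import List, Set, Tuple


def label_by_dictionary(tokens: List[str],
                        dictionary: Set[Tuple[str, ...]]) -> List[str]:
    """Tag tokens 'POS' where a dictionary entry matches, else 'UNL'.

    Hypothetical helper: entries are lowercase token tuples, and the
    longest entry starting at each position wins (an assumed policy).
    """
    max_len = max((len(entry) for entry in dictionary), default=0)
    labels = ["UNL"] * len(tokens)          # unlabelled by default
    lowered = [t.lower() for t in tokens]   # case-insensitive matching
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in dictionary:
                matched = n
                break
        if matched:
            for j in range(i, i + matched):
                labels[j] = "POS"           # reliable positive label
            i += matched
        else:
            i += 1
    return labels


# Toy dictionary and sentence, invented for illustration only.
toy_dictionary = {("interleukin", "2"), ("nf-kappa", "b"), ("insulin",)}
sentence = "Insulin and interleukin 2 activate STAT5".split()
labels = label_by_dictionary(sentence, toy_dictionary)
```

In this toy run, `STAT5` is left as `UNL` because it is missing from the dictionary, which is the kind of incomplete positive data the self-training stage is meant to compensate for.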
         
        
            Keywords : 
character recognition; classification; data mining; learning (artificial intelligence); medical computing; support vector machines; biomedical named entities recognition; biomedical text mining; dictionary lookup; human annotated corpora; self-training process; semisupervised learning; support vector machines; Abstracts; Dictionaries; Humans; Proteins; Semisupervised learning; Supervised learning; Support vector machine classification; Support vector machines; Target recognition; Text recognition;
         
        
        
        
            Conference_Title : 
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
         
        
            Conference_Location : 
Beijing
         
        
            Print_ISBN : 
978-1-4244-1610-3
         
        
            Electronic_ISBN : 
978-1-4244-1611-0
         
        
        
            DOI : 
10.1109/NLPKE.2007.4368014