Title :
A hybrid strategy to protein name recognition
Author :
Wang, Haochang ; Zhao, Tiejun
Author_Institution :
Coll. of Comput. & Inf. Technol., Daqing Pet. Inst., Daqing
Abstract :
This paper presents a comprehensive approach to identifying protein names in biomedical texts. The method integrates the generalized Winnow algorithm with heuristic rules to perform the initial detection of protein names. In addition, the system introduces a statistical method to analyze the reliability of each recognized protein name boundary; boundaries with low confidence are then expanded. The experimental results show that the algorithm improves overall performance for protein name recognition and that protein name boundaries can be identified effectively.
Keywords :
bibliographic systems; data mining; feature extraction; medical information systems; proteins; statistical analysis; text analysis; MEDLINE; biomedical text; feature selection; generalized Winnow algorithm; heuristic rule; name entity recognition; protein name boundary recognition; statistical method; text mining; Amino acids; Automatic speech recognition; Automation; Biomedical computing; Educational institutions; Information technology; Intelligent control; Petroleum; Text recognition; Generalized Winnow; boundary expansion
Conference_Title :
Intelligent Control and Automation, 2008. WCICA 2008. 7th World Congress on
Conference_Location :
Chongqing
Print_ISBN :
978-1-4244-2113-8
Electronic_ISBN :
978-1-4244-2114-5
DOI :
10.1109/WCICA.2008.4592995