Title :
SVM-based Protein-Protein Interaction Extraction from Medline abstracts
Author :
Cui, Baojin ; Lin, Hongfei ; Yang, Zhihao
Author_Institution :
Dept. of Comput. Sci. & Eng., Dalian Univ. of Technol., Dalian
Abstract :
Protein-protein interaction (PPI) extraction has become a research focus, and many methods, including supervised learning approaches, have been applied to it. This paper applies a support vector machine (SVM) to PPI extraction, based on several lexical features and one syntactic feature obtained through the link grammar parser. Because of the complexity of syntax, sentences with different structures do not share the same parse tree, which leads to a data sparseness problem. To mitigate this sparseness, the syntactic feature is used in a simple way. This syntactic feature improves the F-score by nearly five percentage points in the supervised learning approach.
Keywords :
bibliographic systems; data mining; grammars; learning (artificial intelligence); medical information systems; molecular biophysics; proteins; support vector machines; Medline abstract; SVM; data sparseness problem; lexical feature; link grammar parser; parse tree; protein-protein interaction extraction; supervised learning; support vector machine; syntactic feature; Abstracts; Biological processes; Classification tree analysis; Computer science; Data mining; Entropy; Kernel; Protein engineering; Supervised learning; Support vector machines;
Conference_Title :
Bio-Inspired Computing: Theories and Applications, 2007. BIC-TA 2007. Second International Conference on
Conference_Location :
Zhengzhou
Print_ISBN :
978-1-4244-4105-1
Electronic_ISBN :
978-1-4244-4106-8
DOI :
10.1109/BICTA.2007.4806446