Title :
A self-training semi-supervised support vector machine method for recognizing transcription start sites
Author :
Huang, Jun Cai ; Wang, Feng Bi ; Mao, Huan Zhang ; Zhou, Ming Tian
Author_Institution :
Coll. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
Abstract :
The task of finding transcription start sites (TSSs) can be modeled as a classification problem. Semi-supervised support vector machines (S3VMs) are an appealing method for using unlabeled data in classification. Based on the incorporation of prior biological knowledge for recognizing TSSs, we propose a self-training S3VM (ST-S3VM) algorithm. ST-S3VM builds an SVM classifier from a small amount of labeled data and a large amount of unlabeled data, and incorporates prior biological knowledge by engineering an appropriate kernel function within a self-training algorithm. The algorithm has been implemented and tested on previously published data. Our experimental results on real nucleotide sequence data show that our method greatly improves prediction accuracy and performs significantly better than ESTSCAN and SVMs with the Salzberg kernel.
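The abstract does not give the details of ST-S3VM, so the following is only a minimal sketch of the general self-training idea it describes: train a kernel classifier on the labeled sequences, pseudo-label the unlabeled sequences it scores confidently, and retrain. Everything concrete here is an assumption made for illustration and is not taken from the paper: the k-mer spectrum kernel stands in for the biologically motivated kernel, a kernel perceptron stands in for the SVM solver, and the names `spectrum_kernel`, `train_kernel_perceptron`, `self_train`, and the `threshold` parameter are hypothetical.

```python
from collections import Counter


def spectrum_kernel(a, b, k=3):
    """Assumed stand-in kernel: dot product of k-mer count vectors."""
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    return float(sum(ca[m] * cb[m] for m in ca if m in cb))


def train_kernel_perceptron(seqs, labels, kernel, epochs=20):
    """Stand-in for an SVM solver (NOT an SVM): returns a decision function."""
    seqs, labels = list(seqs), list(labels)  # copy so later appends do not leak in
    alpha = [0] * len(seqs)                  # per-example mistake counts

    def decision(x):
        return sum(a * y * kernel(s, x)
                   for a, y, s in zip(alpha, labels, seqs) if a)

    for _ in range(epochs):
        mistakes = 0
        for i, (s, y) in enumerate(zip(seqs, labels)):
            if y * decision(s) <= 0:         # misclassified or on the boundary
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return decision


def self_train(labeled, unlabeled, kernel, threshold=1.0, max_rounds=10):
    """Self-training loop: pseudo-label confident unlabeled sequences, retrain."""
    seqs = [s for s, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    decision = train_kernel_perceptron(seqs, labels, kernel)
    for _ in range(max_rounds):
        scored = [(x, decision(x)) for x in pool]
        confident = [(x, f) for x, f in scored if abs(f) >= threshold]
        if not confident:
            break                            # nothing left to pseudo-label
        for x, f in confident:
            seqs.append(x)
            labels.append(1 if f > 0 else -1)
        pool = [x for x, f in scored if abs(f) < threshold]
        decision = train_kernel_perceptron(seqs, labels, kernel)
    return decision, list(zip(seqs, labels))


# Toy usage with made-up sequences (+1 = TSS-like, -1 = background).
toy_labeled = [("TATAAATATA", 1), ("GCGCGCGGCC", -1)]
toy_unlabeled = ["TATATAAATA", "GCGGCGCGCC", "ACGTACGTAC"]
toy_decision, toy_train = self_train(toy_labeled, toy_unlabeled, spectrum_kernel)
```

In this sketch, sequences whose score magnitude falls below `threshold` are never pseudo-labeled and simply remain in the pool; a real S3VM formulation would handle unlabeled points inside the optimization itself rather than by a hard confidence cutoff.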
Keywords :
biology computing; pattern classification; support vector machines; ESTSCAN; ST-S3VM; SVM classifier; Salzberg kernel; appealing method; biological knowledge; classification problem; kernel function; nucleotide sequences data; self-training S3VM; self-training algorithm; self-training semisupervised support vector machine; semisupervised support vector machines; transcription start sites; unlabeled data; Biological information theory; Classification algorithms; Correlation; Kernel; Support vector machines; Training; Recognizing TSSs; Self-Training Semi-Supervised SVM; kernel function; prior biological knowledge;
Conference_Title :
Apperceiving Computing and Intelligence Analysis (ICACIA), 2010 International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4244-8025-8
DOI :
10.1109/ICACIA.2010.5709922