Title :
A comparison of multi-label techniques based on problem transformation for protein functional prediction
Author :
Giraldo-Forero, A.F. ; Jaramillo-Garzon, Jorge Alberto ; Castellanos-Dominguez, C.G.
Author_Institution :
Signal Process. & Recognition Group, Univ. Nac. de Colombia, Manizales, Colombia
Abstract :
A comparative analysis of four multi-label classification methods is performed to determine the best topology for protein function prediction, using support vector machines as base classifiers. The methods are compared in terms of predictive performance and the computational cost of their parallelized versions, in order to assess their applicability in high-throughput scenarios. Results show that the binary relevance strategy, combined with a class-balancing technique, still outperforms several recently proposed techniques for the problem at hand, while incurring the lowest computational cost when parallelized. However, stacked classifiers and chain classifiers can be conveniently used in pipelines, owing to the low number of false positives they report.
Keywords :
bioinformatics; molecular biophysics; pattern classification; proteins; support vector machines; base classifiers; binary relevance strategy; chain classifications; class balance; comparative analysis; computational cost; high-throughput scenarios; multilabel classification methods; parallelized algorithm versions; problem transformation; protein functional prediction; stacked classifiers; Correlation; Databases; Proteins; Sensitivity; Support vector machines; Topology; Training; Bioinformatics; Multi-label learning; Protein annotation; Support Vector Machines
Conference_Title :
Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE
Conference_Location :
Osaka
DOI :
10.1109/EMBC.2013.6610094