Title :
A Graph Theoretic Algorithm for Removing Redundant Protein Sequences
Author :
Liu, Pengfei ; Zeng, Zhenbing ; Qian, Ziliang ; Feng, KaiYan ; Cai, Yudong
Author_Institution :
Software Eng. Inst., East China Normal Univ., Shanghai, China
Abstract :
Many biological sequence databases contain redundant sequences, which do not help statistical analysis and which cost extra computational time and resources to process. This led us to design a new, fast program for generating a non-redundant sequence set. We designed a graph theoretic algorithm that processes BLAST output and removes redundant proteins from a protein sequence database, and implemented it in a program named BlastCuller. BlastCuller takes a sequence similarity cutoff as a parameter, which can be any decimal value from 0.0 to 1.0. The program can be downloaded from http://pcal.biosino.org/BlastCuller.html.
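The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of one common graph-based approach to culling, not BlastCuller's actual method. It assumes BLAST tabular output (query ID, subject ID, percent identity in the first three tab-separated columns), treats each sequence as a vertex, links two sequences when their identity meets the cutoff, and then greedily drops the most-connected vertex until no links remain. The function names `build_similarity_graph` and `cull_redundant`, and the greedy heuristic, are assumptions made for illustration.

```python
def build_similarity_graph(blast_lines, cutoff):
    """Build an undirected graph from BLAST tabular hits.

    Assumes each line holds tab-separated fields with the query ID,
    subject ID, and percent identity (0-100) in the first three columns.
    `cutoff` is a fraction from 0.0 to 1.0.
    """
    graph = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2]) / 100.0
        # Register both sequences as vertices even if they get no edge.
        graph.setdefault(query, set())
        graph.setdefault(subject, set())
        # Link distinct sequences whose identity meets the cutoff.
        if query != subject and identity >= cutoff:
            graph[query].add(subject)
            graph[subject].add(query)
    return graph


def cull_redundant(graph):
    """Greedily remove the most-connected sequence until no edges remain.

    Returns the sorted IDs of the sequences that are kept. Ties are
    broken alphabetically so the result is deterministic.
    """
    graph = {vertex: set(neighbors) for vertex, neighbors in graph.items()}
    while graph:
        worst = max(sorted(graph), key=lambda v: len(graph[v]))
        if not graph[worst]:
            break  # no similar pairs left
        for neighbor in graph.pop(worst):
            graph[neighbor].discard(worst)
    return sorted(graph)


if __name__ == "__main__":
    # Hypothetical hits, made up for this example only.
    hits = [
        "A\tA\t100.0",
        "A\tB\t95.0",
        "A\tC\t92.0",
        "B\tC\t50.0",
        "D\tD\t100.0",
    ]
    print(cull_redundant(build_similarity_graph(hits, cutoff=0.9)))
```

Under these assumptions, a lower cutoff links more pairs and therefore removes more sequences; other graph strategies (for example, keeping one representative per connected component) would give different non-redundant sets.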
Keywords :
biology computing; molecular biophysics; proteins; BLAST output; BlastCuller; graph theoretic algorithm; nonredundant sequence set; redundant protein sequences; Biology computing; Databases; Filtering; Graph theory; Protein engineering; Protein sequence; Software algorithms; Software engineering; Statistical analysis; Systems biology;
Conference_Title :
Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-2901-1
Electronic_ISBN :
978-1-4244-2902-8
DOI :
10.1109/ICBBE.2009.5162176