DocumentCode
3373366
Title
Computation for Genomics Knowledge Discovery
Author
Butler, Greg
Author_Institution
Dept. of Comput. Sci. & Software Eng., Concordia Univ., Montreal, QC, Canada
fYear
2015
fDate
18-18 May 2015
Firstpage
46
Lastpage
50
Abstract
Knowledge discovery in genomics involves large-scale graph processing and inference, which differs from the high-performance computing used in genomics for sequence analysis. Genomics datasets are becoming increasingly large and varied due to advances in biotechnology. Traditional sequence analysis is therefore computation-intensive for tasks such as assembly of reads, mapping reads to genomes, variation analysis across genomes, sequence similarity, sequence clustering, phylogenetics, and sequence motif and pattern finding. Beyond these data analysis steps come annotation steps to determine genes and their roles. This is knowledge discovery by inference from experimentally characterized genes, with provenance tracking the evidence for and against the annotation, post-processing by rules to catch systematic errors in annotation, gap-filling in systems biology network models, and propagation of changes in our knowledge of experimentally characterized genes. How can we engineer software for these kinds of systems that require high-performance computing?
Keywords
data mining; genomics; sequences; software engineering; biotechnology; genomics knowledge discovery; high performance computing; large scale graph processing; sequence analysis; Bioinformatics; Databases; Genomics; Ontologies; Organisms; Proteins; bioinformatics; change propagation; graph processing; provenance
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering for High Performance Computing in Science (SE4HPCS), 2015 IEEE/ACM 1st International Workshop on
Conference_Location
Florence
Type
conf
DOI
10.1109/SE4HPCS.2015.14
Filename
7173510