DocumentCode :
2516818
Title :
Knowledge discovery and management in life sciences: Impacts and challenges
Author :
Famili, Fazel
Author_Institution :
Knowledge Discovery Group, NRC, Ottawa, ON, Canada
fYear :
2009
fDate :
27-28 Oct. 2009
Abstract :
This paper consists of two parts. The first part provides an overview of knowledge discovery in the life sciences and describes the main motivations for developing and applying knowledge discovery methods to analyze complex biological data. It briefly describes a few case studies that demonstrate the analysis of high-throughput biological data using unsupervised or supervised machine learning techniques. In these cases, real biological data sets (obtained from public or private sources) have been analyzed for tasks such as gene function identification and gene response analysis. Several sources of public data sets are covered, among them GEO (Gene Expression Omnibus), the most popular and best-known source of today's biological data. The objective is to show the impact of knowledge discovery across the entire bioinformatics pipeline, which consists of data pre-processing, data characteristics recognition, pattern recognition, and validation of results. In the second part, the paper describes how discovered and validated knowledge could be structured into a knowledge base, where it can be integrated with other forms of knowledge, disseminated to multiple users, and expanded. Several topics relate to challenges in knowledge management, which is not a trivial task but a demanding paradigm.
Keywords :
biology computing; data mining; knowledge management; unsupervised learning; GEO; bioinformatics; complex biological data; data preprocessing; gene expression omnibus; gene function identification; gene response analysis; knowledge discovery; knowledge management; life sciences; supervised machine learning; unsupervised machine learning; Bioinformatics; Biology; Character recognition; Data analysis; Gene expression; Knowledge management; Machine learning; Pattern recognition; Pipelines; Throughput;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining and Optimization, 2009. DMO '09. 2nd Conference on
Conference_Location :
Kajang
Print_ISBN :
978-1-4244-4944-6
Type :
conf
DOI :
10.1109/DMO.2009.5341924
Filename :
5341924