DocumentCode :
2702350
Title :
Multiresolution approaches to representation and visualization of large influenza virus sequence datasets
Author :
Zaslavsky, Leonid ; Bao, Yiming ; Tatusova, Tatiana A.
Author_Institution :
National Center for Biotechnology Information, Bethesda
fYear :
2007
fDate :
2-4 Nov. 2007
Firstpage :
109
Lastpage :
114
Abstract :
Rapid growth in the amount of genome sequence data requires enhanced exploratory analysis tools that perform analysis in a fast and robust manner. Users need data representations serving different purposes: from seeing overall structure and data coverage to examining evolutionary processes during a particular season. Our approach is to construct hierarchies of data representations and to provide users with representations adaptable to specific goals. This can be done efficiently because the structure of a typical influenza dataset is characterized by low estimated values of the Kolmogorov (box) dimension. Multi-scale methodologies allow interactive visual representation of the dataset and accelerate computations by importance sampling. Our tree visualization approach is based on subtree aggregation with subscale resolution. It allows interactive refinement and coarsening of subtree views. For importance sampling of large influenza datasets, we construct sets of well-scattered points (ε-nets). While a tree built for a global sample provides a coarse-level representation of the whole dataset, it can be complemented by trees showing more detail in chosen areas. To reflect both global dataset structure and local details correctly, we perform local refinement gradually, using a multiscale hierarchy of ε-nets. Our hierarchical representations allow fast metadata searching.
Keywords :
biology computing; data visualisation; genetics; meta data; microorganisms; molecular biophysics; molecular configurations; tree data structures; Kolmogorov box dimension; data representation hierarchy; exploratory analysis tools; fast metadata searching; genome sequence data; influenza virus sequence dataset representation; influenza virus sequence dataset visualization; interactive dataset visual representation; interactive subtree view coarsening; interactive subtree view refinement; subtree aggregation; tree visualization; Bioinformatics; Biotechnology; Data analysis; Genomics; Influenza; Information analysis; Libraries; Monte Carlo methods; Performance analysis; Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedicine Workshops, 2007. BIBMW 2007. IEEE International Conference on
Conference_Location :
Fremont, CA
Print_ISBN :
978-1-4244-1604-2
Type :
conf
DOI :
10.1109/BIBMW.2007.4425408
Filename :
4425408