DocumentCode
1900534
Title
Haplotype Block Partitioning using a Normalized Maximum Likelihood Model
Author
Yang, Yinghua ; Tabus, Ioan
Author_Institution
Tampere Univ. of Technol., Tampere
fYear
2007
fDate
10-12 June 2007
Firstpage
1
Lastpage
4
Abstract
This paper proposes a new method for finding block structure in haplotypes. The method belongs to the family of minimum description length (MDL) methods, which have been intensively investigated for this problem in the past. Within the MDL paradigm, we evaluate the code length using the normalized maximum likelihood (NML) model, as opposed to the two-part codes used previously, resulting in a more compact conditional description. We also propose a new joint clustering and encoding algorithm that selects the cluster centers by minimizing the overall code length when encoding the cluster centers (block prototypes) and the blocks conditional on the block prototypes. The minimized description length provided by the new algorithm is shown to be smaller than that obtained by previous methods when applied to real haplotype data. Inferring the block boundaries with this better code length measure produces different results from the previous methods and significantly reduces the description length of the overall haplotype partition.
Keywords
biology; genetics; maximum likelihood estimation; pattern clustering; clustering algorithm; encoding algorithm; haplotype block partitioning; human genetic variation; minimized description length; minimum description length methods; normalized maximum likelihood model; single nucleotide polymorphism; Biological cells; Clustering algorithms; Encoding; Inference algorithms; Iterative algorithms; Partitioning algorithms; Prototypes; Sequences; Signal processing; Signal processing algorithms
fLanguage
English
Publisher
IEEE
Conference_Titel
Genomic Signal Processing and Statistics, 2007. GENSIPS 2007. IEEE International Workshop on
Conference_Location
Tuusula
Print_ISBN
978-1-4244-0998-3
Electronic_ISBN
978-1-4244-0999-0
Type
conf
DOI
10.1109/GENSIPS.2007.4365840
Filename
4365840