DocumentCode :
1900534
Title :
Haplotype Block Partitioning using a Normalized Maximum Likelihood Model
Author :
Yang, Yinghua ; Tabus, Ioan
Author_Institution :
Tampere Univ. of Technol., Tampere
fYear :
2007
fDate :
10-12 June 2007
Firstpage :
1
Lastpage :
4
Abstract :
This paper proposes a new method for finding block structure in haplotypes. The method belongs to the family of minimum description length (MDL) methods, which have been intensively investigated for this problem in the past. Within the MDL paradigm, we evaluate the code length using the normalized maximum likelihood (NML) model, as opposed to the two-part codes used previously, resulting in a more compact conditional description. We also propose a new joint clustering and encoding algorithm, which selects the cluster centers by minimizing the overall code length for encoding the cluster centers (block prototypes) and the blocks conditional on the block prototypes. The minimized description length provided by the new algorithm is shown to be smaller than that obtained by previous methods when applied to real haplotype data. Inferring the block boundaries with this better code length measure produces results different from those of previous methods and significantly reduces the description length of the overall haplotype partition.
Keywords :
biology; genetics; maximum likelihood estimation; pattern clustering; clustering algorithm; encoding algorithm; haplotype block partitioning; human genetic variation; minimized description length; minimum description length methods; normalized maximum likelihood model; single nucleotide polymorphism; Biological cells; Clustering algorithms; Encoding; Inference algorithms; Iterative algorithms; Partitioning algorithms; Prototypes; Sequences; Signal processing; Signal processing algorithms;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Genomic Signal Processing and Statistics, 2007. GENSIPS 2007. IEEE International Workshop on
Conference_Location :
Tuusula
Print_ISBN :
978-1-4244-0998-3
Electronic_ISBN :
978-1-4244-0999-0
Type :
conf
DOI :
10.1109/GENSIPS.2007.4365840
Filename :
4365840