Title :
Bayesian Network Structure Learning and Inference Methods for Handwriting
Author :
Puri, Mukta ; Srihari, Sargur N. ; Tang, Yi
Author_Institution :
CEDAR, State Univ. of New York, Buffalo, NY, USA
Abstract :
Probabilistic models of characteristics of handwritten words are useful in forensic document examination since they can be used to answer queries such as: determine the rarity of a given style of writing of the word, find the probability of observing those characteristics in a representative database of given size, etc. The task considered here is to use a training set of samples of a word written by a representative population of individuals (with each individual's writing of the word being described by a fixed set of discrete categorical variables) to construct directed probabilistic graphical models (Bayesian networks or BNs) and then use such models to answer probabilistic queries. However, since the BN structure learning problem is NP-hard, we propose an approximate method and analyze its performance and complexity. The proposed algorithm uses a local measure of deviance from independence (chi-squared tests between pairs of variables) and a global score (log-loss). The method builds the BN structure incrementally by adding directed edges with high deviance and choosing the edge direction to minimize log-loss. The method is evaluated with samples of the word "and" obtained from a representative population of the United States, with descriptive characteristic sets that are different for cursive writing and for hand-printing. For several samples obtained from the BN, the probability of random correspondence (PRC) is inferred. A measure of the discriminatory power of the characteristic set (conditional PRC) is also determined. The computational complexity of determining the probability of finding a sample similar to a given one, within a tolerance, in a database of given size is discussed.
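The incremental procedure outlined in the abstract (rank variable pairs by chi-squared deviance from independence, then orient each added edge so that log-loss is lowest) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the fixed cutoff `threshold`, the `max_parents` limit, the Laplace smoothing constant `alpha`, and the toy data are all assumptions introduced here for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def chi2_stat(data, i, j):
    """Pearson chi-squared statistic for independence of columns i and j."""
    n = len(data)
    joint = Counter((row[i], row[j]) for row in data)
    ci = Counter(row[i] for row in data)
    cj = Counter(row[j] for row in data)
    stat = 0.0
    for a in ci:
        for b in cj:
            expected = ci[a] * cj[b] / n
            stat += (joint[(a, b)] - expected) ** 2 / expected
    return stat


def node_log_loss(data, child, parents, arity, alpha=1.0):
    """Negative log-likelihood of one node given its parents (smoothed; alpha is assumed)."""
    parents = tuple(sorted(parents))
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    ctx = Counter(tuple(row[p] for p in parents) for row in data)
    loss = 0.0
    for (pa, _value), count in joint.items():
        prob = (count + alpha) / (ctx[pa] + alpha * arity[child])
        loss -= count * math.log(prob)
    return loss


def creates_cycle(parents, src, dst):
    """True if adding src -> dst closes a cycle, i.e. dst is already an ancestor of src."""
    stack, seen = [src], set()
    while stack:
        v = stack.pop()
        if v == dst:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return False


def learn_structure(data, threshold=3.84, max_parents=2):
    """Greedy sketch: returns {node: set of parent nodes}. Cutoff and parent limit are assumed."""
    n_vars = len(data[0])
    arity = [len({row[v] for row in data}) for v in range(n_vars)]
    parents = {v: set() for v in range(n_vars)}
    # Local measure: rank all pairs by deviance from independence, highest first.
    scored = sorted(((chi2_stat(data, i, j), i, j)
                     for i, j in combinations(range(n_vars), 2)), reverse=True)
    for stat, i, j in scored:
        if stat < threshold:
            break
        best = None
        for src, dst in ((i, j), (j, i)):
            if len(parents[dst]) >= max_parents or creates_cycle(parents, src, dst):
                continue
            # Global score: log-loss decomposes per node, so only dst's term changes.
            delta = (node_log_loss(data, dst, parents[dst] | {src}, arity)
                     - node_log_loss(data, dst, parents[dst], arity))
            if best is None or delta < best[0]:
                best = (delta, src, dst)
        if best is not None:
            parents[best[2]].add(best[1])
    return parents


# Toy categorical data (not handwriting data): column 1 copies column 0,
# column 2 is independent of both.
toy = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 10
print(learn_structure(toy))
```

Because log-loss decomposes into one term per node, the sketch only rescores the child of each candidate edge rather than the whole network. The cutoff of 3.84 is the 5% critical value for one degree of freedom and suits only binary variables; variables with more categories would need a cutoff that depends on the table size.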
Keywords :
belief networks; computational complexity; document handling; inference mechanisms; probability; BN structure learning problem; Bayesian network structure learning; NP-hard problem; United States; chi-squared tests; computational complexity; conditional PRC; cursive writing; directed edges; directed probabilistic graphical models; hand-printing; handwriting; inference methods; log-loss; probabilistic queries; probability of random correspondence; Bayes methods; Forensics; Joints; Probabilistic logic; Sociology; Statistics; Writing; Bayesian networks; forensics; structure learning;
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/ICDAR.2013.267