Title of article :
Character confidence based on N-best list for keyword spotting in online Chinese handwritten documents
Author/Authors :
Zhang, Heng and Wang, Da-Han and Liu, Cheng-Lin
Issue Information :
Journal with serial issue number, year 2014
Abstract :
In keyword spotting from handwritten documents by text query, the word similarity is usually computed by combining character similarities, which should ideally approximate the logarithm of the character probabilities. In this paper, we propose to directly estimate the posterior probability (also called confidence) of candidate characters based on the N-best paths from the candidate segmentation-recognition lattice. After the candidate segmentation-recognition paths are evaluated by combining multiple contexts, the scores of the N-best paths are transformed into posterior probabilities using soft-max. The parameter of soft-max (the confidence parameter) is estimated from the character confusion network, which is constructed by aligning different paths using a string matching algorithm. The posterior probability of a candidate character is the sum of the probabilities of the paths that pass through that candidate character. We compare the proposed posterior probability estimation method with several reference methods, including the word confidence measure and the text line recognition method. Experimental results of keyword spotting on CASIA-OLHWDB, a large database of unconstrained online Chinese handwriting, demonstrate the effectiveness of the proposed method.
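The two computational steps named in the abstract (soft-max over the N-best path scores, then summing path probabilities per candidate character) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the representation of a candidate character as a `(start, end, label)` tuple, the toy paths and scores, and the fixed value of the confidence parameter `alpha` are all assumptions made for illustration; the paper estimates that parameter from a character confusion network, which is not reproduced here.

```python
import math
from collections import defaultdict


def path_posteriors(scores, alpha=1.0):
    """Soft-max over N-best path scores.

    alpha is a hypothetical stand-in for the confidence parameter;
    here it is simply supplied by the caller.
    """
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(alpha * (s - top)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def character_posteriors(paths, scores, alpha=1.0):
    """Sum the path probabilities over every path passing through a candidate.

    Each path is a list of (start, end, label) tuples (assumed format):
    a segment of the text line plus the class assigned to it.
    """
    probs = path_posteriors(scores, alpha)
    posterior = defaultdict(float)
    for path, p in zip(paths, probs):
        for candidate in set(path):  # count each candidate once per path
            posterior[candidate] += p
    return dict(posterior)


# Toy 3-best list over two segments (made-up labels and scores).
paths = [
    [(0, 1, "中"), (1, 2, "国")],
    [(0, 1, "中"), (1, 2, "囯")],
    [(0, 1, "申"), (1, 2, "国")],
]
scores = [-1.0, -2.0, -3.0]

probs = path_posteriors(scores, alpha=1.0)
posterior = character_posteriors(paths, scores, alpha=1.0)
```

In this toy case the candidate `(0, 1, "中")` lies on the first two paths, so its posterior is the sum of those two path probabilities. A keyword score could then be formed by combining the logarithms of such character posteriors, in line with the abstract's description of word similarity.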
Keywords :
Online Chinese handwritten documents , Keyword spotting , Posterior probability , Confusion network , N-best list , Confidence measure
Journal title :
PATTERN RECOGNITION