DocumentCode
3409352
Title
Robust named entity detection in videotext using character lattices
Author
Subramanian, Krishna ; Prasad, Rohit ; Macrostie, Ehry ; Natarajan, Prem
Author_Institution
BBN Technologies, Cambridge, MA
fYear
2008
fDate
March 31 2008-April 4 2008
Firstpage
1241
Lastpage
1244
Abstract
Text in video sequences can provide key indexing information. In particular, videotext is rich in named entities (NEs), and detection of such entities is critical for search applications. Traditional approaches for detecting NEs in OCR output look for them in the single-best recognition results. Due to the inevitable presence of recognition errors in the single-best output, such approaches usually result in low recall. Given that a lattice is more likely to contain the correct answer, we explore NE detection from character lattices produced by our videotext OCR system. Furthermore, we use an approximate match criterion that allows insertion of punctuation during lookup. Experimental results show a 50% relative improvement in NE recall using lattices over exact lookup in the 1-best hypothesis. Since the improvement in recall is accompanied by a large number of false positives, we present techniques for reducing false alarms. In addition, we describe efficient techniques for reducing the time needed to detect NEs.
Keywords
character recognition; image sequences; video signal processing; OCR; character lattices; entity detection; named entities; recognition errors; video sequences; videotext; Character generation; Engines; Feature extraction; Hidden Markov models; Indexing; Lattices; Optical character recognition software; Robustness; Text recognition; Optical Character Recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location
Las Vegas, NV
ISSN
1520-6149
Print_ISBN
978-1-4244-1483-3
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2008.4517841
Filename
4517841