Title :
General and domain-specific techniques for detecting and recognizing superimposed text in video
Author :
Zhang, DongQing ; Rajendran, Raj Kumar ; Chang, Shih-Fu
Author_Institution :
Dept. of Electr. Eng., Columbia Univ., New York, NY, USA
Abstract :
We have developed generic and domain-specific video algorithms for caption text extraction and recognition in digital video. Our system includes several unique features. For caption box location, we combine compressed-domain features derived from DCT coefficients and motion vectors, and we employ long-term temporal consistency to enhance localization performance. For character segmentation, we use a single-pass, threshold-free approach that combines classification and projection to address noisy segmentation, text intensity variation, and algorithm complexity. For recognition, we use Zernike moments to improve recognition accuracy. Finally, domain knowledge is explored and a statistical transition graph model is used to enhance recognition of domain-specific characters, such as ball counts and game scores in baseball videos. The algorithms achieved real-time speed and significantly improved recognition accuracy. Furthermore, although the experiments were conducted on baseball videos only, these algorithms (except the transition model) are general and can be used in other applications, such as news and films.
Keywords :
character recognition; data compression; discrete cosine transforms; image classification; image segmentation; object detection; statistical analysis; transform coding; video coding; DCT coefficients; Zernike moments; ball counts; baseball videos; caption box location; character segmentation; classification; compressed-domain features; domain-specific techniques; domain-specific video algorithms; game score; generic video algorithms; localization performance; long-term temporal consistency; motion vectors; noisy segmentation; projection; recognition accuracy; recognition performance; single-pass threshold free approach; statistical transition graph model; superimposed text detection; superimposed text recognition; text intensity variation; transition model; Character recognition; Discrete cosine transforms; Games; Image recognition; Image retrieval; Indexing; Layout; Statistical distributions; Text recognition; Video compression;
Conference_Title :
Image Processing. 2002. Proceedings. 2002 International Conference on
Print_ISBN :
0-7803-7622-6
DOI :
10.1109/ICIP.2002.1038093