DocumentCode :
457398
Title :
Summarization of JBIG2 Compressed Indian Language Textual Images
Author :
Garain, Utpal ; Datta, Alok K. ; Bhattacharya, U. ; Parui, S.K.
Author_Institution :
Indian Stat. Inst., Kolkata
Volume :
3
fYear :
2006
fDate :
0-0 0
Firstpage :
344
Lastpage :
347
Abstract :
This paper presents a method for automatic summarization of JBIG2 coded textual images without optical character recognition (OCR). Compressed images are partially decompressed (less than 10% of the uncompressed image size), and text lines and words are marked. A few features are computed at the sentence level. Based on the feature values, each sentence is then marked as a summary sentence or not. The system finally generates a set of sentences as the summary. In addition, sentences are ranked within the summary. The experiments consider Indian language text images. Test results show a sentence selection efficiency of about 56% when judged against summaries generated by humans. A nonparametric (distribution-free) rank statistic shows a correlation coefficient of 0.28 as a measure of the (minimum) strength of the association between sentence rankings by machine and by humans.
Keywords :
data compression; document image processing; image coding; natural languages; Indian language textual image summarization; JBIG2 compressed textual image; nonparametric distribution-free rank statistic; Character recognition; Humans; Image coding; Image retrieval; Information retrieval; Libraries; Optical character recognition software; Prototypes; Statistical distributions; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition, 2006. ICPR 2006. 18th International Conference on
Conference_Location :
Hong Kong
ISSN :
1051-4651
Print_ISBN :
0-7695-2521-0
Type :
conf
DOI :
10.1109/ICPR.2006.1090
Filename :
1699536