DocumentCode :
922674
Title :
Word Indexing of Ancient Documents Using Fuzzy Classification
Author :
Sousa, João M. C. ; Gil, João M. ; Pinto, J. R. C.
Author_Institution :
Lisbon Tech. Univ., Lisbon
Volume :
15
Issue :
5
fYear :
2007
Firstpage :
852
Lastpage :
862
Abstract :
This paper proposes a fuzzy classification system to perform word indexing in ancient printed documents. The indexing system receives a word selected by a user. The word is preprocessed using an aspect ratio filter, ensuring that only interesting word candidates are considered. The image is classified by oriented feature extraction using Gabor filter banks. The oriented features are used to generate membership functions that characterize the selected word. This target word image is then compared to the potential matches using a similarity matrix. The indexing system is flexible and lightweight when compared to other optimal recognizers, which allows its use in "real-time" applications. A significant test revealed that the indexer achieved very good results in terms of precision and recall in texts from the XVIIth century.
Keywords :
Gabor filters; feature extraction; filtering theory; fuzzy set theory; image classification; indexing; matrix algebra; word processing; Gabor filter banks; ancient printed documents; aspect ratio filter; feature extraction; fuzzy classification systems; membership functions; real-time applications; similarity matrix; word indexing; Clustering algorithms; Content based retrieval; Fuzzy systems; Gabor filters; Image processing; Image retrieval; Image storage; Indexing; Information retrieval; Optical character recognition software; Fuzzy classification; fuzzy indexing; old documents; word recognition;
fLanguage :
English
Journal_Title :
Fuzzy Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6706
Type :
jour
DOI :
10.1109/TFUZZ.2006.889933
Filename :
4343120