• DocumentCode
    2148322
  • Title
    Document Images Indexing with Relevance Feedback: An Application to Industrial Context
  • Author
    Augereau, O. ; Journet, N. ; Domenger, J.P.
  • Author_Institution
    Lab. Bordelais de Rech. en Inf. (LaBRI), Univ. de Bordeaux, Talence, France
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    1190
  • Lastpage
    1194
  • Abstract
    This article presents a new method for indexing document images. The work is set in an industrial context where thousands of document images are digitized daily and must be sorted into classes such as payroll, various bills, and information letters. We propose a software method that aims to accelerate this task. The number of document classes is usually unknown a priori, so we propose an automatic estimation of this number. Given the estimated number of classes, we use a clustering algorithm to group the document images. After this step, we propose an assisted classification tool based on a content-based image retrieval (CBIR) method. For each cluster, a reference image is automatically selected; the other images are then sorted according to a similarity measure and shown to the user. By interacting with the process, the user can reject wrong images. The user feedback is automatically taken into account to enhance the similarity measure by weighting each feature. The first tests show that, on average, databases are indexed 3 times faster with our assisted classification method than with a standard manual classification process.
  • Keywords
    content-based retrieval; document image processing; image classification; image retrieval; pattern clustering; visual databases; CBIR; automatic estimation; content based image retrieval method; document images indexing; industrial context application; pattern classification; relevance feedback; software method; Accuracy; Companies; Estimation; Humans; Indexing; Labeling; document clustering; document retrieval; feature selection; industrial application
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition (ICDAR), 2011 International Conference on
  • Conference_Location
    Beijing
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4577-1350-7
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2011.240
  • Filename
    6065498