• DocumentCode
    2265390
  • Title

    Automatically identifying join candidates in the Cairo Genizah

  • Author

    Wolf, Lior ; Littman, Rotem ; Mayer, Naama ; Dershowitz, Nachum ; Shweka, Roni ; Choueka, Yaacov

  • Author_Institution
    Blavatnik Sch. of Comput. Sci., Tel Aviv Univ., Tel Aviv, Israel
  • fYear
    2009
  • fDate
    Sept. 27 2009-Oct. 4 2009
  • Firstpage
    978
  • Lastpage
    979
  • Abstract
    A join is a set of manuscript fragments that are known to originate from the same original work. The Cairo Genizah is a collection containing approximately 250,000 fragments of mainly Jewish texts discovered in the late 19th century. The fragments are today spread out in libraries and private collections worldwide, and there is an ongoing effort to document and catalogue all extant fragments. The task of finding joins is currently conducted manually by experts, and presumably only a small fraction of the existing joins have been discovered. In this work, we study the problem of automatically finding candidate joins, so as to streamline the task. The proposed method is based on a combination of local descriptors and learning techniques. To evaluate the performance of various join-finding methods without relying on the availability of human experts, we construct a benchmark dataset that is modeled on the Labeled Faces in the Wild benchmark for face recognition. Using this benchmark, we evaluate several alternative image representations and learning techniques. Finally, a set of newly discovered join candidates has been identified using our method and validated by a human expert.
  • Keywords
    document image processing; face recognition; image representation; text analysis; Cairo Genizah; automatically identifying join candidates; face recognition; image representations; join finding methods; learning techniques; local descriptor combination; manuscript fragments; private collections; wild benchmark; Availability; Computer science; Computer vision; Conferences; Face recognition; Humans; Kernel; Libraries; Machine learning algorithms; Machine vision;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on
  • Conference_Location
    Kyoto
  • Print_ISBN
    978-1-4244-4442-7
  • Electronic_ISBN
    978-1-4244-4441-0
  • Type

    conf

  • DOI
    10.1109/ICCVW.2009.5457596
  • Filename
    5457596