Title :
Scribe Identification in Medieval English Manuscripts
Author :
Gilliam, Tara ; Wilson, Richard C. ; Clark, John A.
Author_Institution :
Dept. of Comput. Sci., Univ. of York, York, UK
Abstract :
In this paper we present work on automated scribe identification on a new Middle-English manuscript dataset dating from around the 14th to 15th centuries. We discuss the image and textual problems encountered in processing historical documents, and demonstrate the effect of accounting for manuscript style on the writer identification rate. Using the grapheme codebook method with a modification to the distance measure, we achieve a Top-1 classification accuracy of up to 77%. The performance of the Sparse Multinomial Logistic Regression classifier is compared against five k-nn classifiers. We also consider classification against the principal components and propose a method for visualising the principal component vectors in terms of the original grapheme features.
Keywords :
document image processing; handwriting recognition; image classification; principal component analysis; regression analysis; Middle-English manuscript dataset; Top-1 classification; automated scribe identification; distance measure; grapheme codebook; historical documents; k-nn classifiers; medieval English manuscripts; principal component vectors; sparse multinomial logistic regression classifier; writer identification; Accuracy; Data visualization; Ink; Pixel; Principal component analysis; Visualization; Writing; Character and text recognition; Document analysis and recognition
Conference_Title :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
Print_ISBN :
978-1-4244-7542-1
DOI :
10.1109/ICPR.2010.463