Title of article
The Challenge of Optical Music Recognition
Author/Authors
BAINBRIDGE, DAVID (author); BELL, TIM (author)
Issue Information
Periodical with serial number, year 2001
Pages
-94
From page
95
To page
0
Abstract
The most important approaches to computer-assisted authorship attribution are exclusively based on lexical measures that either represent the vocabulary richness of the author or simply comprise frequencies of occurrence of common words. In this paper we present a fully automated approach to identifying the authorship of unrestricted text that excludes any lexical measure. Instead, we adapt a set of style markers to the analysis of the text performed by an existing natural language processing tool, using three stylometric levels: token-level, phrase-level, and analysis-level measures. The last of these represent the way in which the text has been analyzed. The presented experiments on a Modern Greek newspaper corpus show that the proposed set of style markers reliably distinguishes the authors of a randomly chosen group and performs better than a lexically based approach. However, the combination of the two approaches provides the most accurate solution (87% accuracy). Moreover, we describe experiments on various sizes of the training data, as well as tests dealing with the significance of the proposed set of style markers.
Keywords
optical music recognition , musical data acquisition , Document image analysis , Pattern recognition
Journal title
COMPUTERS AND THE HUMANITIES
Serial Year
2001
Record number
32086