Title :
Identification of Ancient Greek Papyrus Fragments Using Genetic Sequence Alignment Algorithms
Author :
Williams, Alex C. ; Carroll, Hyrum D. ; Wallin, John F. ; Brusuelas, James ; Fortson, Lucy ; Lamblin, Anne-Francoise ; Yu, Haoyu
Author_Institution :
Middle Tennessee State Univ., Murfreesboro, TN, USA
Abstract :
Papyrologists analyze, transcribe, and edit papyrus fragments in order to enrich modern lives by better understanding the linguistics, culture, and literature of the ancient world. One of their common tasks is to match an unknown fragment to a known manuscript. This is especially challenging when the fragments are damaged and contain only limited information (e.g., due to deterioration). In the last 100 years, only about 10% of the more than 500,000 fragments recovered from the Egyptian village of Oxyrhynchus have been edited. We do not know what new ancient texts might be found and what can be learned from them, but with current methods of identification, this process will take in excess of 1,000 years. The identification of an anonymous string of characters with a collection of known text sequences is ubiquitous in computational biology. Genes are often represented by a sequence of continuous characters, each of which denotes an amino acid. Relationships are inferred by finding multi-letter patterns shared between the anonymous sequence and a known sequence. This process is commonly referred to as genetic sequence alignment. In this paper, we introduce a novel methodology that uses modern genetic sequence alignment algorithms as a method for identifying Ancient Greek text fragments. This application will offer papyrologists and other professionals in the humanities the ability to rapidly identify severely damaged texts. This approach leverages a new form of non-contextual, multi-line text identification for the Greek language that can greatly accelerate the tedious task of transcription and identification.
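The matching idea described in the abstract can be illustrated with a minimal sketch of local sequence alignment (Smith-Waterman style scoring) applied to text. This is not the authors' implementation: the function names `local_align_score` and `identify`, the scoring values (match +2, mismatch -1, gap -1), and the tiny two-entry corpus of unaccented Homeric openings are all assumptions chosen for illustration.

```python
def local_align_score(query, reference, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two strings (Smith-Waterman style).

    The scoring values are illustrative assumptions, not taken from the paper.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    prev = [0] * (len(reference) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, r in enumerate(reference, start=1):
            diag = prev[j - 1] + (match if q == r else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr.append(max(0, diag, prev[j] + gap, curr[j - 1] + gap))
        best = max(best, max(curr))
        prev = curr
    return best


def identify(fragment, corpus):
    """Return the (title, score) pair of the corpus text that aligns best."""
    scored = ((title, local_align_score(fragment, text))
              for title, text in corpus.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy corpus: unaccented, lowercase opening lines.
corpus = {
    "Iliad 1.1": "μηνιν αειδε θεα πηληιαδεω αχιληος",
    "Odyssey 1.1": "ανδρα μοι εννεπε μουσα πολυτροπον",
}

# A "damaged" fragment: one letter differs from the Iliad text (θια for θεα).
fragment = "ειδε θια πηλ"

if __name__ == "__main__":
    title, score = identify(fragment, corpus)
    print(title, score)
```

Because the alignment is local, the fragment does not need to start or end at a word or line boundary, and a single misread character only lowers the score rather than preventing the match.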
Keywords :
bioinformatics; data mining; genetic algorithms; humanities; amino acid; ancient Greek papyrus fragment; ancient Greek text fragment; computational biology; culture; genetic sequence alignment algorithm; linguistics; multiline text identification; noncontextual identification; Amino acids; Computational biology; Databases; Educational institutions; Error analysis; Genetics; Matrices; Ancient Greek; genetic sequence alignment; identification; papyrus
Conference_Title :
e-Science (e-Science), 2014 IEEE 10th International Conference on
Conference_Location :
Sao Paulo
Print_ISBN :
978-1-4799-4288-6
DOI :
10.1109/eScience.2014.14