Title :
The coding of literary form: Data mining and the information structure of historical texts
Author_Institution :
Department of English, Augsburg College, Minneapolis, Minnesota
Abstract :
This working paper argues that many data-mining projects in the humanities limit themselves by choosing words as their default unit of analysis. Some authors, problems, and forms are better illuminated by analysis of individual textual symbols, others by examination of multiword constructions. Insights about the nature of code from mathematical information theory, long rejected (perhaps prematurely) by humanists on theoretical grounds, may give researchers less subjective and more powerful tools with which to measure the information characteristics of texts and the innovations of specific historical writers.
Keywords :
"Computers","Volume measurement","Databases","Scholarships","Big data","Data mining","Information theory"
Conference_Title :
Big Data (Big Data), 2015 IEEE International Conference on
DOI :
10.1109/BigData.2015.7363936