Title :
Who Are We? Mining Institutional Identities Using n-grams
Author :
Soper, Daniel S. ; Turel, Ofir
Author_Institution :
Inf. Syst. & Decision Sci. Dept., California State Univ., Fullerton, CA, USA
Abstract :
Disciplines and organizations alike can be defined by the text they produce, the topics they discuss, and the language they employ. Analyzing such large amounts of text is challenging, but it is nevertheless needed because it can help stakeholders understand key themes in, and the evolution of, their corporate or disciplinary identity. N-gram analysis is a leading text-mining technique that can be leveraged for this purpose. In this manuscript, we present the development, and demonstrate the potential utility, of an n-gram analysis tool. We focus on revealing several aspects of the identity of an academic journal, namely Communications of the ACM (CACM), through the analysis of over 14 million unique n-grams and their relative frequencies. The results of the study imply that n-gram analyses may be a key tool in resolving the IS identity crisis. Implications for research and practice are discussed.
Keywords :
data mining; text analysis; CACM; Communications of the ACM; N-gram analysis; academic journal; institutional identities mining; text-mining technique; Databases; Frequency conversion; History; Industries; Planning; Time frequency analysis
Conference_Title :
System Science (HICSS), 2012 45th Hawaii International Conference on
Conference_Location :
Maui, HI
Print_ISBN :
978-1-4577-1925-7
Electronic_ISBN :
1530-1605
DOI :
10.1109/HICSS.2012.642