DocumentCode
2031438
Title
Authorship analysis based on metrics
Author
Yuntao Zhang ; Ling Gong ; Yongcheng Wang
Author_Institution
Shanghai Jiaotong University
fYear
2002
fDate
19-19 June 2002
Firstpage
199
Lastpage
199
Abstract
Summary form only given, as follows. Authorship analysis identifies the author of a text by the genre, attributes, features and traits that are unique to that author. A related task is to discriminate between two authors by their distinguishing characteristics. The computational linguistic analysis is divided into two layers. The bottom layer deals with lexical information, stylistics and terminology in the text, and the upper layer deals with the structure and layout of the text. Stylistic features and terminology statistics are concerned with words and their patterns in a particular corpus. The linguistic measures of the bottom layer include morphemes, average word length distribution, vocabulary distribution, word frequency, word order, average sentence length and sentence structure. The upper-layer analysis does not treat text only as a "bag of words" or "set of words". It covers the structure and layout of the text as well as the use and distribution of the various punctuation marks. The measures of text structure include the average paragraph length, the average section and chapter length, and the use and distribution of subtitles. Vocabulary richness of a text is measured by the word spectrum of the text and the weighted use of each vocabulary item.
Keywords
Computational linguistics; Frequency measurement; Iterative algorithms; Length measurement; Sections; Sliding mode control; Statistical distributions; Terminology; Text categorization; Vocabulary;
fLanguage
English
Publisher
ieee
Conference_Titel
Control and Automation, 2002. ICCA. Final Program and Book of Abstracts. The 2002 International Conference on
Conference_Location
Xiamen, Fujian Province, China
Print_ISBN
0-7803-7412-6
Type
conf
DOI
10.1109/ICCA.2002.1229659
Filename
1229659