Title of article
The Sichel model and the mixing and truncation order
Author/Authors
Xavier Puig, Josep Ginebra & Marti Font
Issue Information
Journal issue with serial number, year 2010
Pages
19
From page
1585
To page
1603
Abstract
The analysis of word frequency count data can be very useful in authorship attribution problems. Zero-truncated
generalized inverse Gaussian–Poisson mixture models are very helpful in the analysis of these
kinds of data because their model-mixing density estimates can be used as estimates of the density of the
word frequencies of the vocabulary. It is found that this model provides excellent fits for the word frequency
counts of very long texts, where the truncated inverse Gaussian–Poisson special case fails because it does
not allow for the large degree of over-dispersion in the data. The role played by the three parameters of
this truncated GIG-Poisson model is also explored. Our second goal is to compare the fit of the truncated
GIG-Poisson mixture model with the fit of the model that results from switching the order of the mixing
and truncation stages. A heuristic interpretation of the mixing distribution estimates obtained under this
alternative GIG-truncated Poisson mixture model is also provided.
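The two orderings compared in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a GIG mixing density proportional to lam^(p-1) * exp(-(a*lam + b/lam)/2), which may differ from the parametrization used in the article, and it normalizes that density on a grid rather than through Bessel functions. The function names, the parameter values and the grid settings are all hypothetical choices made for illustration.

```python
import math

def gig_unnormalized(lam, p, a, b):
    # Assumed GIG kernel: lam^(p-1) * exp(-(a*lam + b/lam)/2); normalized numerically below.
    return lam ** (p - 1.0) * math.exp(-0.5 * (a * lam + b / lam))

def poisson_pmf(r, lam):
    # Poisson probability of count r, computed on the log scale for stability.
    return math.exp(-lam + r * math.log(lam) - math.lgamma(r + 1))

def make_grid(p, a, b, upper=60.0, n=2000):
    # Midpoint grid on (0, upper); weights sum to 1 and approximate the GIG density.
    h = upper / n
    lams = [(i + 0.5) * h for i in range(n)]
    dens = [gig_unnormalized(lam, p, a, b) for lam in lams]
    total = sum(dens)
    weights = [d / total for d in dens]
    return lams, weights

def mix_then_truncate(r, lams, weights):
    # Mix the Poisson over the GIG first, then condition on the count being at least 1.
    num = sum(w * poisson_pmf(r, lam) for lam, w in zip(lams, weights))
    p0 = sum(w * math.exp(-lam) for lam, w in zip(lams, weights))
    return num / (1.0 - p0)

def truncate_then_mix(r, lams, weights):
    # Zero-truncate the Poisson for each rate first, then mix over the GIG.
    return sum(w * poisson_pmf(r, lam) / (-math.expm1(-lam))
               for lam, w in zip(lams, weights))

# Illustrative parameters only (p = -0.5 gives the inverse Gaussian special case).
lams, weights = make_grid(p=-0.5, a=1.0, b=1.0)
counts = range(1, 151)
pmf_a = [mix_then_truncate(r, lams, weights) for r in counts]
pmf_b = [truncate_then_mix(r, lams, weights) for r in counts]
print("P(count = 1), mix then truncate:", pmf_a[0])
print("P(count = 1), truncate then mix:", pmf_b[0])
```

With the same mixing density, the two orderings yield different count distributions. Truncating before mixing gives words with small rates full weight, so it assigns more probability to a count of one than mixing before truncating does.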
Keywords
Poisson mixture , stylometry , truncated mixture , truncated model , Word frequency , Categorical data , Mixture model , generalized inverse Gaussian
Journal title
JOURNAL OF APPLIED STATISTICS
Serial Year
2010
Record number
712481