DocumentCode :
1031020
Title :
Random texts exhibit Zipf's-law-like word frequency distribution
Author :
Li, Wentian
Author_Institution :
Rockefeller Univ., New York, NY, USA
Volume :
38
Issue :
6
fYear :
1992
fDate :
11/1/1992 12:00:00 AM
Firstpage :
1842
Lastpage :
1845
Abstract :
It is shown that the distribution of word frequencies for randomly generated texts is very similar to Zipf's law observed in natural languages such as English. The facts that the frequency of occurrence of a word is almost an inverse power law function of its rank and the exponent of this inverse power law is very close to 1 are largely due to the transformation from the word's length to its rank, which stretches an exponential function to a power law function.
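The claim in the abstract can be checked numerically with a minimal sketch like the one below. It is not the author's procedure. It assumes a text drawn uniformly at random from a small alphabet plus a space character, with words defined as the runs of letters between spaces. The function names, the four-letter alphabet, the text length, the seed, and the choice to fit only the top 100 ranks are all illustrative assumptions.

```python
import math
import random
from collections import Counter


def random_text_rank_frequency(num_chars=200_000, alphabet="abcd", seed=0):
    """Return word counts of a random text, sorted from most to least frequent.

    Assumption: every letter and the space are equally likely at each position.
    """
    rng = random.Random(seed)
    symbols = alphabet + " "
    text = "".join(rng.choice(symbols) for _ in range(num_chars))
    # str.split() with no argument drops the empty strings between repeated spaces.
    counts = Counter(text.split())
    return sorted(counts.values(), reverse=True)


def loglog_slope(freqs, max_rank=100):
    """Least-squares slope of log(frequency) against log(rank) for the top ranks."""
    points = [(math.log(rank), math.log(freq))
              for rank, freq in enumerate(freqs[:max_rank], start=1)]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


if __name__ == "__main__":
    freqs = random_text_rank_frequency()
    print("fitted log-log slope:", loglog_slope(freqs))
```

Because words of the same length are equally probable, the rank-frequency curve of such a text is a staircase: each extra letter divides the frequency by a constant factor while multiplying the number of competing words, and hence the rank, by another constant. The fitted slope is therefore only a rough summary of the power-law trend and depends on the alphabet size and on how many ranks are included.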
Keywords :
computational linguistics; information theory; natural languages; English; Zipf's law; inverse power law function; randomly generated texts; rank; statistical linguistics; word frequency distribution; Books; Frequency; Natural languages; Statistics; Testing
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/18.165464
Filename :
165464