Title :
New tricks for old dogs: Large alphabet probability estimation
Author :
Santhanam, N.P. ; Orlitsky, A. ; Viswanathan, K.
Author_Institution :
UC Berkeley, Berkeley
Abstract :
We build on prior results on probability estimation obtained in [1]. We specialize the results to uniform distributions in order to obtain sampling rules for support size estimation. We consider text classification, and show that the estimators developed for probability estimation can improve current state-of-the-art techniques.
Keywords :
estimation theory; statistical distributions; alphabet probability estimation; support size estimation; text classification; Availability; Databases; Distributed computing; Dogs; Internet; Lakes; Maximum likelihood estimation; Sampling methods; State estimation; Text categorization
Conference_Title :
Information Theory Workshop, 2007. ITW '07. IEEE
Conference_Location :
Tahoe City, CA
Print_ISBN :
1-4244-1564-0
Electronic_ISBN :
1-4244-1564-0
DOI :
10.1109/ITW.2007.4313149