Title :
Notice of Retraction
Empirical study of IDF on text classification dataset
Author :
Ziqiang Li ; Mingtian Zhou
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
Abstract :
Notice of Retraction
After careful and considered review of the content of this paper by a duly constituted expert committee, this paper has been found to be in violation of IEEE's Publication Principles.
We hereby retract the content of this paper. Reasonable effort should be made to remove all past references to this paper.
The presenting author of this paper has the option to appeal this decision by contacting TPII@ieee.org.
This paper observes and analyzes IDF and its properties on the best TC (text classification) dataset. We check whether the occurrence frequency (OF) and document frequency (DF) of features follow Zipf's law. We also pay close attention to the validity of the ordering relationships based on OF and DF, and conclude that an ordering based on a linear combination of OF and DF may be more informative. Our observations show that IDF has little ability to recognize categories and contributes little to text classification when used by itself. It seems that the novelty or expressive force of a feature can be formulated as a linear combination of IDF, the average occurrence frequency, and its standard deviation.
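As an illustration of the quantities named in the abstract, the following is a minimal sketch of how OF, DF, and IDF might be computed for a tokenized corpus. It assumes the common textbook definition IDF = log(N / DF); the abstract does not state which variant the authors used. The function name `feature_stats` and the toy corpus are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def feature_stats(docs):
    """Return {term: (of, df, idf)} for a list of tokenized documents.

    Assumption: idf = log(N / df), the common textbook form. The paper's
    exact definition is not given in the abstract.
    """
    n_docs = len(docs)
    of = Counter()  # occurrence frequency: total count over the whole corpus
    df = Counter()  # document frequency: number of documents containing the term
    for tokens in docs:
        of.update(tokens)
        df.update(set(tokens))  # count each term at most once per document
    return {t: (of[t], df[t], math.log(n_docs / df[t])) for t in of}


# Toy corpus (hypothetical), already split into tokens.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "stocks fell on the news".split(),
]
stats = feature_stats(docs)

# Print terms ranked by OF; a Zipf-style check would compare rank with frequency.
for term, (of_count, df_count, idf) in sorted(stats.items(), key=lambda kv: -kv[1][0]):
    print(f"{term:8s} OF={of_count} DF={df_count} IDF={idf:.3f}")
```

In this toy corpus a term that appears in every document receives an IDF of zero regardless of which category its documents belong to, which is in line with the abstract's remark that IDF alone says little about categories.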
Keywords :
text analysis; IDF; Zipf law; category recognition; document frequency; occurrence frequency; text classification dataset; CHI-Square; IG; odd;
Conference_Title :
Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4244-5537-9
DOI :
10.1109/ICCSIT.2010.5565078