Title :
Language Feature Mining for Document Subjectivity Analysis
Author :
Chen, Bo ; He, Hui ; Guo, Jun
Abstract :
In recent years, document sentiment analysis has attracted a great deal of research interest. One important aspect of this field is subjectivity analysis. This problem differs from traditional text categorization in that more linguistic or semantic information is required to better estimate the subjectivity of a document. Therefore, in this paper we focus on how to extract useful and meaningful language features and how to combine these features efficiently. Under the well-known n-gram language model framework, we investigated a series of language-grams of different orders n and various distances to find the most important ones. In addition, we tried several weighting methods to make the features more meaningful. Based on these language features, we adopted a tailored Maximum Entropy modeling method to construct our subjectivity classifier. Detailed experiments reported in this paper show that well-extracted language features are well suited to the document subjectivity analysis task.
Keywords :
Classification tree analysis; Data mining; Entropy; Internet; Machine learning; Machine learning algorithms; Military computing; Motion pictures; Text analysis; Text categorization
Conference_Title :
Data, Privacy, and E-Commerce, 2007. ISDPE 2007. The First International Symposium on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3016-1
DOI :
10.1109/ISDPE.2007.105