Title :
Retrieval of Personal Web Documents by Extracting Subjective Expressions
Author :
Hayashi, Takahiro ; Abe, Koji ; Onai, Rikio
Author_Institution :
Univ. of Electro-Commun., Tokyo
Abstract :
This paper presents a method for gathering Japanese Web documents that contain personal opinions. The method can serve as a pre-processing step for applications that mine opinions. To find personal documents on the Web, we focus on four kinds of subjective expressions: (1) negative-meaning expressions, (2) final particles, (3) interjections, and (4) specific symbols such as face marks. By measuring the frequencies of these subjective expressions in a document, our method classifies Web documents as personal or non-personal. It also assigns each document a score that indicates the accuracy of its classification result. We experimentally confirmed the effectiveness of the proposed method using 1200 Web documents. The experimental results show that the precision and recall of the proposed classification are 0.70 and 0.87, respectively. In addition, we confirmed that personal documents can be easily obtained by gathering the documents that are given high scores.
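The frequency-based idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the lexicons, the scoring formula, or the decision threshold, so everything below is an assumption made for illustration: the tiny word lists, the face-mark pattern, the "matches per 100 characters" score, and the threshold of 1.0 are hypothetical and are not the authors' method.

```python
import re

# Hypothetical, deliberately tiny lexicons (assumptions, not the paper's lists).
NEGATIVE_EXPRESSIONS = ["ない", "ません"]
INTERJECTIONS = ["うーん", "ああ", "えっ"]
# Final particles are only counted directly before sentence-ending punctuation.
FINAL_PARTICLE = re.compile(r"(?:よね|かな|よ|ね|わ)(?=[。!?!?]|$)", re.MULTILINE)
# A few face marks / symbols; real collections are far larger.
FACE_MARK = re.compile(r"\(\^[_o-]\^\)|\(笑\)")


def count_features(text: str) -> dict:
    """Count occurrences of the four kinds of subjective expressions."""
    return {
        "negative": sum(text.count(w) for w in NEGATIVE_EXPRESSIONS),
        "final_particle": len(FINAL_PARTICLE.findall(text)),
        "interjection": sum(text.count(w) for w in INTERJECTIONS),
        "symbol": len(FACE_MARK.findall(text)),
    }


def score(text: str) -> float:
    """Assumed score: subjective expressions per 100 characters."""
    if not text:
        return 0.0
    return 100.0 * sum(count_features(text).values()) / len(text)


def is_personal(text: str, threshold: float = 1.0) -> bool:
    """Classify a document as personal if its score reaches the assumed threshold."""
    return score(text) >= threshold


if __name__ == "__main__":
    personal = "うーん、この映画は面白くないよね。(^_^)"
    news = "政府は新しい予算案を発表した。"
    for doc in (personal, news):
        print(count_features(doc), round(score(doc), 2), is_personal(doc))
```

Sorting documents by such a score in descending order mirrors the abstract's observation that personal documents can be collected by taking the highest-scored ones. A realistic implementation would likely rely on a Japanese morphological analyzer rather than substring matching.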
Keywords :
Internet; classification; data mining; information retrieval; text analysis; Japanese Web documents; documents scoring; face marks; final particles; interjections; negative meaning expressions; opinion mining; personal Web document retrieval; personal opinions; subjective expression extraction; Application software; Blogs; Computer science; Frequency measurement; Informatics; Proposals; Search engines; Technology planning; Personal Web Pages; Subjective Expressions; WWW
Conference_Title :
Advanced Information Networking and Applications - Workshops, 2008. AINAW 2008. 22nd International Conference on
Conference_Location :
Okinawa
Print_ISBN :
978-0-7695-3096-3
DOI :
10.1109/WAINA.2008.52