DocumentCode
2753807
Title
A Novel Approach for Refinement of Corpus in the Field of Opinion Mining
Author
Bhattacharyya, Debnath ; Das, Poulami ; Mitra, Kheyali ; Ganguly, Debashis ; Mukherjee, Swarnendu ; Bandyopadhyay, S.K. ; Kim, Tai-Hoon
Author_Institution
Comput. Sci. & Eng. Dept., Heritage Inst. of Technol., Kolkata, India
fYear
2009
fDate
7-9 March 2009
Firstpage
281
Lastpage
285
Abstract
This paper presents a heuristic approach, based on regular expressions, for refining a corpus, and discusses its possible applications in the field of Opinion Mining. The proposed work is based on a corpus of reviews. The crude corpus consists of raw HTML files containing reviews. Each HTML file is refined so that only the required part of the page is retained. The final output is a set of XML files that store the important parts of the review pages extracted from the refined HTML. These XML files are then fed to subsequent language processing stages for machine learning in the field of Opinion Mining.
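The pipeline described in the abstract (raw HTML review page, regular-expression refinement, XML output) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the page layout (`<div class="review">` containing an `<h3>` title and a `<p>` body), the function names `refine_page` and `clean`, and the XML element names are all hypothetical, chosen only to show the general idea.

```python
import html
import re
import xml.etree.ElementTree as ET

# ASSUMPTION: the review markup below is hypothetical; the abstract does not
# specify the layout of the raw HTML review pages.
REVIEW_RE = re.compile(
    r'<div class="review">\s*<h3>(?P<title>.*?)</h3>\s*<p>(?P<body>.*?)</p>\s*</div>',
    re.DOTALL | re.IGNORECASE,
)
TAG_RE = re.compile(r"<[^>]+>")


def clean(fragment: str) -> str:
    """Drop inline tags, decode HTML entities, and collapse whitespace."""
    text = TAG_RE.sub(" ", fragment)
    return " ".join(html.unescape(text).split())


def refine_page(page: str) -> str:
    """Keep only the review parts of a raw HTML page and return them as XML."""
    root = ET.Element("reviews")  # hypothetical XML schema
    for match in REVIEW_RE.finditer(page):
        review = ET.SubElement(root, "review")
        ET.SubElement(review, "title").text = clean(match.group("title"))
        ET.SubElement(review, "body").text = clean(match.group("body"))
    return ET.tostring(root, encoding="unicode")
```

Calling `refine_page` on a page string returns an XML string with one `<review>` element per matched block, and everything outside the matched blocks (navigation, advertisements, scripts) is discarded. Regular expressions are fragile on irregular markup, so a sketch like this only suits pages with a known, consistent template.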
Keywords
XML; data mining; hypermedia markup languages; learning (artificial intelligence); natural language processing; HTML files; XML files; corpus refinement; crude corpus; language processing; machine learning process; opinion mining field; review corpus; Application software; Computer science; Frequency; HTML; Humans; Learning systems; Machine learning; Natural language processing; Natural languages; Speech; Corpus; crude corpus; natural language processing; regular expression
fLanguage
English
Publisher
IEEE
Conference_Titel
Future Networks, 2009 International Conference on
Conference_Location
Bangkok
Print_ISBN
978-0-7695-3567-8
Type
conf
DOI
10.1109/ICFN.2009.24
Filename
5189944