Title :
Subjectivity classification and analysis of the ASRS corpus
Author :
Switzer, Jason ; Khan, Latifur ; Muhaya, Fahad Bin
Author_Institution :
Univ. of Texas at Dallas, Dallas, TX, USA
Abstract :
Semantic analysis of corpora containing heavy usage of jargon words and phrases introduces problems not commonly addressed by Natural Language Processing methods. Modern semantic analysis relies on data from unedited websites or from expertly written sources, which lack similar usage of jargon words and phrases. This paper presents a system of semi-supervised lexicon learning algorithms that collate several manually labeled and clustered data sources, such as thesauri. In addition, it demonstrates an improvement in the performance of these subjectivity classifiers by applying a boosting method. Finally, it presents a method of automatic Aviation Safety Reporting System (ASRS) shaping factor classification based on the most relevant words from a subjectivity lexicon.
Keywords :
learning (artificial intelligence); natural language processing; pattern classification; text analysis; word processing; ASRS corpus analysis; ASRS corpus classification; automatic aviation safety reporting system shaping factor classification; boosting method; corpora semantic analysis; jargon words; natural language processing methods; semisupervised lexicon learning algorithms; unedited Web sites; Boosting; Classification algorithms; Equations; Humans; Natural language processing; Speech recognition; Thesauri; Annotation; Classification; Sentiment; Text mining;
Conference_Title :
Information Reuse and Integration (IRI), 2011 IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4577-0964-7
Electronic_ISBN :
978-1-4577-0965-4
DOI :
10.1109/IRI.2011.6009539