DocumentCode :
2625696
Title :
Structural Parse Tree Features for Text Representation
Author :
Massung, Sean ; Zhai, Chengxiang ; Hockenmaier, Julia
Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
fYear :
2013
fDate :
16-18 Sept. 2013
Firstpage :
9
Lastpage :
16
Abstract :
We propose and study novel text representation features created from parse tree structures. Unlike traditional parse tree features, which include all the attached syntactic categories to capture linguistic properties of text, the new features are defined solely or primarily by the tree structure and thus better reflect the purely structural properties of parse trees. We hypothesize that these new complex structural features capture a perspective of text that is orthogonal even to advanced syntactic features. Evaluation on three different text categorization tasks (nationality detection, essay scoring, and sentiment analysis) shows that the proposed tree structure features complement the existing ones and enrich text representation. Experimental results further show that combining the proposed structural features with word n-grams can improve F1 score and classification accuracy.
Keywords :
classification; computational linguistics; text analysis; trees (mathematics); F1 score; classification accuracy; complex structural features; essay scoring; linguistic properties; nationality detection; orthogonal perspective; parse tree structures; sentiment analysis; structural parse tree features; structural properties; syntactic categories; text categorization tasks; text representation; word n-grams; Accuracy; Feature extraction; Information retrieval; Production; Skeleton; Syntactics; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Semantic Computing (ICSC), 2013 IEEE Seventh International Conference on
Conference_Location :
Irvine, CA
Type :
conf
DOI :
10.1109/ICSC.2013.13
Filename :
6693488