DocumentCode
2888121
Title
Domain Adaptation Using Domain Similarity- and Domain Complexity-Based Instance Selection for Cross-Domain Sentiment Analysis
Author
Remus, R.
Author_Institution
Dept. of Comput. Sci., Univ. of Leipzig, Leipzig, Germany
fYear
2012
fDate
10-10 Dec. 2012
Firstpage
717
Lastpage
723
Abstract
We propose an approach to domain adaptation that selects the instances from a source domain training set that are most similar to a target domain. The factor by which the original source domain training set size is reduced is determined automatically by measuring the domain similarity between source and target domain as well as their domain complexity variance. Domain similarity is measured as the divergence between term unigram distributions. Domain complexity is measured as homogeneity, i.e., self-similarity. We evaluate our approach in a semi-supervised cross-domain document-level polarity classification experiment and show that it yields small but statistically significant improvements over several natural baselines and achieves results competitive with other state-of-the-art domain adaptation schemes.
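The selection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not name the divergence, so Jensen-Shannon divergence is assumed here; the whitespace tokenizer, the split-halves notion of self-similarity, and the names `unigram_distribution`, `js_divergence`, `self_similarity`, `select_instances`, and `keep_fraction` are all hypothetical. In particular, the paper determines the reduction factor automatically, whereas this sketch takes it as a given parameter.

```python
import math
from collections import Counter


def unigram_distribution(docs):
    """Relative frequency of each lowercased, whitespace-separated term."""
    counts = Counter(term for doc in docs for term in doc.lower().split())
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two term distributions.

    Assumed measure, not confirmed by the abstract. Ranges from 0
    (identical) to 1 (no shared terms).
    """
    def kl_to_mixture(a, b):
        # KL(a || m) with m = (a + b) / 2; only terms with a[t] > 0 count.
        return sum(
            pa * math.log2(pa / (0.5 * (pa + b.get(t, 0.0))))
            for t, pa in a.items()
            if pa > 0
        )
    return 0.5 * kl_to_mixture(p, q) + 0.5 * kl_to_mixture(q, p)


def self_similarity(docs):
    """Hypothetical homogeneity score: 1 - divergence between two halves."""
    first, second = docs[0::2], docs[1::2]
    return 1.0 - js_divergence(
        unigram_distribution(first), unigram_distribution(second)
    )


def select_instances(source_docs, target_docs, keep_fraction):
    """Keep the source documents whose unigrams diverge least from the target.

    keep_fraction is a stand-in for the automatically determined
    reduction factor described in the abstract.
    """
    target_dist = unigram_distribution(target_docs)
    ranked = sorted(
        source_docs,
        key=lambda d: js_divergence(unigram_distribution([d]), target_dist),
    )
    keep = max(1, round(keep_fraction * len(source_docs)))
    return ranked[:keep]


if __name__ == "__main__":
    source = [
        "great battery life",
        "boring plot and weak acting",
        "battery charges fast",
    ]
    target = ["the battery lasts long", "fast charging battery"]
    print(select_instances(source, target, keep_fraction=0.67))
```

In this toy example the movie-review sentence shares no terms with the target documents, so it ranks last and is dropped when two of the three source documents are kept.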
Keywords
document handling; pattern classification; cross-domain sentiment analysis; domain adaptation scheme; domain complexity-based instance selection; domain similarity-based instance selection; semisupervised cross-domain document-level polarity classification experiment; source domain training set; target domain; term unigram distributions; Accuracy; Adaptation models; Complexity theory; Computational linguistics; Conferences; Natural language processing; Training; Cross-domain sentiment analysis; Domain adaptation; Domain complexity; Domain similarity; Instance selection; Polarity classification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on
Conference_Location
Brussels
Print_ISBN
978-1-4673-5164-5
Type
conf
DOI
10.1109/ICDMW.2012.46
Filename
6406510