DocumentCode
3433495
Title
Semisupervised domain adaptation for mixture model based classifiers
Author
Raghuram, Jayaram ; Miller, David J. ; Kesidis, George
Author_Institution
Dept. of EE, Pennsylvania State Univ., University Park, PA, USA
fYear
2012
fDate
21-23 March 2012
Firstpage
1
Lastpage
6
Abstract
This paper introduces a method for domain adaptation of mixture model-based classifiers, for the setting where one has adequate labeled training data for one (source) domain, very scarce labeled data for another (target) domain, and where the discrepancy between the source and target domain class-conditional distributions is not “too great”. Starting from the source domain classifier parameters, the method maximizes the likelihood of the target domain data while being constrained to agree as much as possible with the target domain label information. This is achieved via an expectation maximization (EM) algorithm in which the joint distribution of the latent variables in the E-step is parametrically constrained, so that the space-partitioning implications of the labeled target domain samples are captured. Experiments on publicly available Internet packet-flow traffic data from different temporal and spatial domains demonstrate significant gains in classification performance compared to (1) direct porting of the source domain classifier, (2) semisupervised learning using only the target domain data, and (3) an extension of an existing unsupervised domain adaptation method.
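The general idea in the abstract (start EM from the source domain parameters, fit the target domain data, and respect the few target labels) can be illustrated with a minimal sketch. This is not the paper's algorithm: the paper constrains the E-step distribution parametrically, whereas the sketch below uses a simpler, standard semisupervised device, forcing each labeled sample's responsibilities onto the components of its own class. The one-dimensional Gaussian components, the function names (`adapt_mixture`, `classify`), and the toy data are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def adapt_mixture(source, unlabeled, labeled, n_iter=200, var_floor=1e-3):
    """EM on target data, initialized at the source-domain parameters.

    source:    list of dicts with keys "weight", "mean", "var", "label"
    unlabeled: list of target-domain feature values
    labeled:   list of (value, label) pairs; every label is assumed to be
               owned by at least one source component
    """
    comps = [dict(c) for c in source]  # copy, so the source model is untouched
    samples = [(x, None) for x in unlabeled] + list(labeled)
    n = len(samples)
    for _ in range(n_iter):
        # E-step: responsibilities; a labeled sample may only be explained
        # by components that carry its class label (hard masking).
        resp = []
        for x, y in samples:
            r = [c["weight"] * gaussian_pdf(x, c["mean"], c["var"])
                 if (y is None or c["label"] == y) else 0.0
                 for c in comps]
            total = sum(r)
            if total <= 0.0:  # numerical underflow: spread over allowed components
                r = [1.0 if (y is None or c["label"] == y) else 0.0 for c in comps]
                total = sum(r)
            resp.append([v / total for v in r])
        # M-step: standard weighted maximum-likelihood updates.
        for k, c in enumerate(comps):
            nk = sum(r[k] for r in resp)
            c["weight"] = nk / n
            if nk > 1e-12:
                mean = sum(r[k] * x for r, (x, _) in zip(resp, samples)) / nk
                var = sum(r[k] * (x - mean) ** 2 for r, (x, _) in zip(resp, samples)) / nk
                c["mean"] = mean
                c["var"] = max(var, var_floor)  # guard against variance collapse
    return comps

def classify(comps, x):
    """Pick the class whose components give the largest weighted density at x."""
    scores = {}
    for c in comps:
        score = c["weight"] * gaussian_pdf(x, c["mean"], c["var"])
        scores[c["label"]] = scores.get(c["label"], 0.0) + score
    return max(scores, key=scores.get)

# Toy setup (hypothetical): one component per class in the source domain,
# and a target domain whose class means are shifted by +1.0.
source = [
    {"weight": 0.5, "mean": 0.0, "var": 1.0, "label": 0},
    {"weight": 0.5, "mean": 4.0, "var": 1.0, "label": 1},
]
# Deterministic, Gaussian-shaped target data built from normal quantiles.
quantiles = [(i + 0.5) / 50 for i in range(50)]
unlabeled = ([NormalDist(1.0, 0.7).inv_cdf(q) for q in quantiles]
             + [NormalDist(5.0, 0.7).inv_cdf(q) for q in quantiles])
labeled = [(2.4, 0), (3.6, 1)]  # the scarce target-domain labels

adapted = adapt_mixture(source, unlabeled, labeled)
print(classify(source, 2.5), classify(adapted, 2.5))
```

In this toy setup, the point 2.5 lies between the source decision boundary and the shifted target classes, so it is a case where directly porting the source classifier and adapting it can disagree. Hard masking is the bluntest way to use labels; it says nothing about how a labeled sample should influence the partitioning of nearby unlabeled samples, which is the issue the paper's constrained E-step is designed to address.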
Keywords
Internet; expectation-maximisation algorithm; learning (artificial intelligence); pattern classification; telecommunication traffic; E-Step; EM algorithm; Internet packet flow traffic data; classification performance gain; expectation maximization algorithm; labeled training data; latent variables; mixture model based classifiers; semisupervised domain adaptation; semisupervised learning; source domain class-conditional distribution; source domain classifier parameters; space-partitioning; spatial domain; target domain class-conditional distribution; target domain data; target domain label information; temporal domain; unsupervised domain adaptation method; Accuracy; Adaptation models; Classification algorithms; Data models; Histograms; Semantics; Spatial databases; Internet traffic classification; constrained maximum likelihood; domain adaptation; mixture discriminant analysis; semisupervised learning; space-partitioning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Sciences and Systems (CISS), 2012 46th Annual Conference on
Conference_Location
Princeton, NJ
Print_ISBN
978-1-4673-3139-5
Electronic_ISBN
978-1-4673-3138-8
Type
conf
DOI
10.1109/CISS.2012.6310708
Filename
6310708