The relative value of labeled and unlabeled samples in pattern recognition with an unknown mixing parameter

Author

Castelli, Vittorio ; Cover, Thomas M.

Author_Institution

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

Volume

42

Issue

6

fYear

1996

fDate

11/1/1996

Firstpage

2102

Lastpage

2117

Abstract

We observe a training set Q composed of l labeled samples {(X₁,θ₁),...,(X_l,θ_l)} and u unlabeled samples {X₁´,...,X_u´}. The labels θ_i are independent random variables satisfying Pr{θ_i=1}=η, Pr{θ_i=2}=1-η. The labeled observations X_i are independently distributed with conditional density f_{θ_i}(·) given θ_i. Let (X₀,θ₀) be a new sample, independently distributed as the samples in the training set. We observe X₀ and we wish to infer the classification θ₀. In this paper, we first assume that the distributions f₁(·) and f₂(·) are given and that the mixing parameter η is unknown. We show that the relative value of labeled and unlabeled samples in reducing the risk of optimal classifiers is the ratio of the Fisher informations they carry about the parameter η. We then assume that two densities g₁(·) and g₂(·) are given, but we do not know whether g₁(·)=f₁(·) and g₂(·)=f₂(·) or the opposite holds, nor do we know η. Thus the learning problem consists of both estimating the optimum partition of the observation space and assigning the classifications to the decision regions. Here, we show that labeled samples are necessary to construct a classification rule and that they are exponentially more valuable than unlabeled samples.
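
The Fisher-information ratio in the first result can be illustrated numerically. The sketch below is not taken from the paper: it uses the standard Fisher-information expressions for this mixture model, namely 1/(η(1-η)) for one labeled sample and ∫ (f₁(x)-f₂(x))² / (ηf₁(x)+(1-η)f₂(x)) dx for one unlabeled sample. The unit-variance Gaussian densities, the value η=0.3, and all function names are assumptions chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, std=1.0):
    """Density of a normal distribution (illustrative choice of f1, f2)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fisher_labeled(eta):
    """Fisher information about eta carried by one labeled sample (X, theta).

    The score is 1/eta when theta=1 and -1/(1-eta) when theta=2,
    so the information is 1/eta + 1/(1-eta) = 1/(eta*(1-eta)).
    """
    return 1.0 / (eta * (1.0 - eta))


def fisher_unlabeled(eta, f1, f2, lo=-20.0, hi=20.0, steps=40000):
    """Fisher information about eta carried by one unlabeled sample X.

    X has density eta*f1 + (1-eta)*f2; the integral of
    (f1 - f2)^2 / mixture is approximated with the midpoint rule.
    The integration range [lo, hi] is an assumption that suits the
    Gaussian example below.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        a, b = f1(x), f2(x)
        mix = eta * a + (1.0 - eta) * b
        if mix > 0.0:
            total += (a - b) ** 2 / mix * h
    return total


# Hypothetical example: two unit-variance Gaussians centred at -1 and +1.
eta = 0.3
f1 = lambda x: gaussian_pdf(x, -1.0)
f2 = lambda x: gaussian_pdf(x, 1.0)

i_l = fisher_labeled(eta)
i_u = fisher_unlabeled(eta, f1, f2)
print("labeled:", i_l, "unlabeled:", i_u, "ratio:", i_l / i_u)
```

Under these assumptions the labeled-sample information is never smaller than the unlabeled-sample information, since an unlabeled sample is a labeled sample with the label discarded; the ratio printed above is the quantity the abstract refers to for the case of known f₁ and f₂.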

Keywords

decision theory; estimation theory; learning (artificial intelligence); pattern classification; random processes; Fisher information; classification rule; conditional density; decision region; independent random variables; independently distributed samples; labeled samples; learning problem; observation space; optimal classifiers; optimum partition; pattern recognition; training set; unknown mixing parameter; unlabeled samples; Random variables; Time of arrival estimation

fLanguage

English

Journal_Title

Information Theory, IEEE Transactions on

Publisher

IEEE

ISSN

0018-9448

Type

jour

DOI

10.1109/18.556600

Filename

556600