Abstract:
This paper considers model selection in classification. In many applications, such as pattern recognition, probabilistic inference using a Bayesian network, and prediction of the next element in a sequence based on a Markov chain, the conditional probability P(Y=y|X=x) of class y ∈ Y given attribute value x ∈ X is used. By a model we mean an equivalence relation on X: for x, x′ ∈ X, x ~ x′ ⇔ P(Y=y|X=x) = P(Y=y|X=x′) for all y ∈ Y. By classification we mean that the number of such equivalence classes is finite. We estimate the model from n samples z^n = (x_i, y_i)_{i=1..n} ∈ (X × Y)^n using an information criterion of the form empirical entropy H plus penalty term (k/2)d_n (the model minimizing H + (k/2)d_n is the estimated model), where k is the number of independent parameters in the model and {d_n}_{n≥1} is a nonnegative real sequence such that lim sup_{n→∞} d_n/n = 0. For autoregressive processes, although the definitions of H and k differ, it is known that the estimated model almost surely coincides with the true model as n → ∞ if {d_n}_{n≥1} > {2 log log n}_{n≥1}, and that it does not if {d_n}_{n≥1} < {2 log log n}_{n≥1} (Hannan and Quinn). Whether the same property holds for classification was an open problem. This paper solves that problem in the affirmative.
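As a rough illustration of the selection rule described above, the following Python sketch evaluates H + (k/2)d_n for a set of candidate models and returns the minimizer, with the Hannan–Quinn-type penalty d_n = 2 log log n. The function names (`empirical_entropy`, `select_model`) and the representation of a model as a dict mapping each attribute value to its equivalence class are assumptions made for illustration, not constructions taken from the paper.

```python
import math
from collections import Counter

def empirical_entropy(samples, partition):
    """Negative maximized log-likelihood (nats) of y given the equivalence
    class of x, i.e. n times the empirical conditional entropy H."""
    joint = Counter((partition[x], y) for x, y in samples)
    marg = Counter(partition[x] for x, _ in samples)
    return -sum(c * math.log(c / marg[cls]) for (cls, _), c in joint.items())

def information_criterion(samples, partition, num_labels, d_n):
    """H + (k/2) d_n, with k = (#equivalence classes) * (|Y| - 1)."""
    k = len(set(partition.values())) * (num_labels - 1)
    return empirical_entropy(samples, partition) + 0.5 * k * d_n

def select_model(samples, candidate_partitions, num_labels):
    """Return the candidate partition minimizing the criterion,
    using the Hannan-Quinn-type penalty d_n = 2 log log n (n >= 3)."""
    n = len(samples)
    d_n = 2.0 * math.log(math.log(n))
    return min(candidate_partitions,
               key=lambda p: information_criterion(samples, p, num_labels, d_n))
```

For example, with X = {0, 1, 2} and Y = {0, 1}, one candidate partition might map 0 and 1 to the same class and 2 to another, while a coarser candidate maps all attribute values to a single class; `select_model` would compare their penalized empirical entropies and keep the smaller one.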
Keywords:
Markov processes; autoregressive processes; belief networks; entropy; pattern classification; probability; sequences; Bayesian network; Markov chain; classification; conditional probability; empirical entropy; information criteria; model selection; strong consistency; artificial intelligence; Bayesian methods; intelligent networks; mathematics; pattern recognition; random variables; statistical learning; statistics; error probability; Hannan and Quinn's procedure; Kullback–Leibler divergence; law of the iterated logarithm