Title :
Minimum description length principle for maximum entropy model selection
Author :
Pandey, G.K. ; Dukkipati, Ambedkar
Author_Institution :
Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore, India
Abstract :
In the maximum entropy method, one chooses, from a set of distributions, the distribution that maximizes the Shannon entropy, in order to make inferences from incomplete information. There are various ways to specify this set of distributions, an important special case being when the set is described by mean-value constraints on some feature functions. In this case, the maximum entropy method yields an exponential-family distribution that depends on the feature functions, which have to be chosen a priori. In this paper, we treat the problem of selecting a maximum entropy model, given various feature subsets and their moments, as a model selection problem, and present a minimum description length (MDL) formulation to solve it. To this end, we derive the normalized maximum likelihood (NML) code-length for these models. Furthermore, we show that the minimax entropy method is a special case of maximum entropy model selection in which the complexities of all the models are assumed to be equal. We extend our approach to discriminative maximum entropy models. We apply our approach to the gene selection problem, selecting the number of moments for each gene to fix the model.
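For orientation, the quantities named in the abstract can be sketched with standard textbook definitions. The notation here is an assumption for illustration and is not taken from the paper: feature functions \phi_i, target moments \mu_i, multipliers \lambda_i, a discrete sample space, an i.i.d. sample x^n, and COMP_n(M) for the NML normalizer of model M. The last equality assumes the constraints are set to the empirical moments of x^n, in which case the maximized log-likelihood equals n times the entropy of the fitted distribution. Under these assumptions, equal complexity terms would leave only the entropy term to compare, which is consistent with the minimax entropy remark in the abstract.

```latex
% Maximum entropy problem with mean-value constraints (assumed notation)
\begin{align*}
p^{*} &= \arg\max_{p} \; H(p) \quad \text{s.t.} \quad \mathbb{E}_{p}[\phi_i(X)] = \mu_i, \; i = 1,\dots,m, \\
p^{*}(x) &= \frac{1}{Z(\lambda)} \exp\Big(-\sum_{i=1}^{m} \lambda_i \phi_i(x)\Big), \qquad
Z(\lambda) = \sum_{x} \exp\Big(-\sum_{i=1}^{m} \lambda_i \phi_i(x)\Big).
\end{align*}

% NML code-length of a sample x^n under model M (standard definition)
\begin{align*}
L_{\mathrm{NML}}(x^n; M) &= -\log p\big(x^n; \hat{\lambda}(x^n)\big) + \log \sum_{y^n} p\big(y^n; \hat{\lambda}(y^n)\big) \\
&= n\, H\big(p_{\hat{\lambda}(x^n)}\big) + \log \mathrm{COMP}_n(M).
\end{align*}
```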
Keywords :
codes; exponential distribution; maximum entropy methods; maximum likelihood estimation; MDL formulation; NML code-length; Shannon entropy; discriminative maximum entropy model; maximum entropy model selection problem; mean-value constraint; minimum description length; minimum description length principle; normalized maximum likelihood; Complexity theory; Computational modeling; Data models; Entropy; Information theory; Mathematical model; Probability distribution;
Conference_Title :
Information Theory Proceedings (ISIT), 2013 IEEE International Symposium on
Conference_Location :
Istanbul
DOI :
10.1109/ISIT.2013.6620481