Author_Institution :
Dept. of Stat., Univ. de Brasilia, Brasília, Brazil
Abstract :
Let D and V denote respectively the information divergence and the total variation distance. Pinsker's and Vajda's inequalities are, respectively, $D \ge \frac{1}{2} V^2$ and $D \ge \log\frac{2+V}{2-V} - \frac{2V}{2+V}$. In this paper, several generalizations and improvements of these inequalities are established for wide classes of f-divergences. First, conditions on f are determined under which an f-divergence $D_f$ will satisfy $D_f \ge c_f V^2$ or $D_f \ge c_{2,f} V^2 + c_{4,f} V^4$, where the constants $c_f$, $c_{2,f}$ and $c_{4,f}$ are best possible. As a consequence, lower bounds in terms of V are obtained for many well-known distance and divergence measures, including the $\chi^2$ and Hellinger's discriminations and the families of Tsallis' and Rényi's divergences. For instance, if $D^{(\alpha)}(P\|Q) = [\alpha(\alpha-1)]^{-1}\left[\int p^{\alpha} q^{1-\alpha}\, d\mu - 1\right]$ and $\mathfrak{I}_{\alpha}(P\|Q) = (\alpha-1)^{-1} \log\left[\int p^{\alpha} q^{1-\alpha}\, d\mu\right]$ are, respectively, the relative information of type $\alpha$ and Rényi's information gain of order $\alpha$, it is shown that $D^{(\alpha)} \ge \frac{1}{2} V^2 + \frac{1}{72}(\alpha+1)(2-\alpha) V^4$ whenever $-1 \le \alpha \le 2$, $\alpha \ne 0, 1$, and that $\mathfrak{I}_{\alpha} \ge \frac{\alpha}{2} V^2 + \frac{1}{36}\,\alpha(1 + 5\alpha - 5\alpha^2) V^4$ for $0 < \alpha < 1$. In a somewhat different direction, and motivated by the fact that these Pinsker-type lower bounds are accurate only for small variation (V close to zero), lower bounds for $D_f$ which are accurate for both small and large variation (V close to two) are also obtained. In the special case of the information divergence they imply that $D \ge \log\frac{2}{2-V} - \frac{2-V}{2}\log\frac{2+V}{2}$, which uniformly improves Vajda's inequality.
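For readers who want to sanity-check the quoted bounds numerically, the following is a minimal sketch (not taken from the paper) that draws random discrete distributions and tests the Pinsker-, Vajda- and fourth-order-type inequalities stated above. It assumes V is the $L^{1}$ (total variation) distance $\int |p-q|\, d\mu$, natural logarithms throughout, and uses NumPy; all function names are ad hoc.

```python
# Minimal numerical sketch (assumptions: V = L1 distance, natural logs, finite supports).
import numpy as np

rng = np.random.default_rng(0)

def total_variation(p, q):
    # V(P,Q) = sum_i |p_i - q_i|, so 0 <= V <= 2 as in the abstract
    return np.abs(p - q).sum()

def kl_divergence(p, q):
    # Information divergence D(P||Q) = sum_i p_i log(p_i / q_i)
    return np.sum(p * np.log(p / q))

def tsallis_alpha(p, q, alpha):
    # Relative information of type alpha:
    # D^(alpha) = [alpha(alpha-1)]^{-1} (sum_i p_i^alpha q_i^{1-alpha} - 1)
    return (np.sum(p**alpha * q**(1 - alpha)) - 1) / (alpha * (alpha - 1))

def renyi_alpha(p, q, alpha):
    # Renyi information gain of order alpha:
    # I_alpha = (alpha-1)^{-1} log sum_i p_i^alpha q_i^{1-alpha}
    return np.log(np.sum(p**alpha * q**(1 - alpha))) / (alpha - 1)

tol = 1e-12
for _ in range(1000):
    p = rng.dirichlet(np.ones(5))
    q = rng.dirichlet(np.ones(5))
    V = total_variation(p, q)
    D = kl_divergence(p, q)

    # Pinsker's inequality and the improved Vajda-type bound
    assert D >= 0.5 * V**2 - tol
    assert D >= np.log(2 / (2 - V)) - (2 - V) / 2 * np.log((2 + V) / 2) - tol

    # Fourth-order bound for the type-alpha divergence (-1 <= alpha <= 2, alpha != 0, 1)
    alpha = 0.5
    assert tsallis_alpha(p, q, alpha) >= 0.5 * V**2 + (alpha + 1) * (2 - alpha) / 72 * V**4 - tol

    # Bound for Renyi's information gain (0 < alpha < 1)
    assert renyi_alpha(p, q, alpha) >= alpha / 2 * V**2 + alpha * (1 + 5*alpha - 5*alpha**2) / 36 * V**4 - tol

print("all sampled distributions satisfy the stated lower bounds")
```

Dirichlet samples keep every probability strictly positive, so all divergences are finite and the assertions only need a small floating-point tolerance.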
Keywords :
entropy; Csiszár f-divergences; Hellinger discrimination; Pinsker type inequalities; Rényi divergences; Tsallis divergences; Vajda type inequalities; information divergence; lower bounds; relative entropy; total variation distance; Convergence; Convex functions; Entropy; Loss measurement; Polynomials; Probability; Taylor series; Kullback–Leibler divergence; Rényi's information gain; information inequalities; relative information; variational or $L^{1}$ distance;