Author_Institution :
Dept. of Math. & Stat., Univ. of Ottawa, Ottawa, ON, Canada
Abstract :
Exchangeable random variables form an important and well-studied generalization of i.i.d. variables; however, simple examples show that no nontrivial concept or function classes are PAC learnable under general exchangeable data inputs X_1, X_2, .... Inspired by the work of Berti and Rigo on a Glivenko-Cantelli theorem for exchangeable inputs, we propose a new paradigm, adequate for learning from exchangeable data: predictive PAC learnability. A learning rule L for a function class F is predictive PAC if for every ε, δ > 0 and each function f ∈ F, whenever |σ| ≥ s(δ, ε), we have with confidence 1 − δ that the expected difference between f(X_{n+1}) and the image of f|_σ under L does not exceed ε conditionally on X_1, X_2, ..., X_n. Thus, instead of learning the function f as such, we learn, to a given accuracy ε, the predictive behaviour of f at the future points X_i(ω), i > n, of the sample path. Using de Finetti's theorem, we show that if a universally separable function class F is distribution-free PAC learnable under i.i.d. inputs, then it is distribution-free predictive PAC learnable under exchangeable inputs, with a slightly worse sample complexity.
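In symbols, the predictive PAC condition stated above can be written as follows (our own reconstruction from the prose of the abstract, not a quotation from the paper; the inner expectation is conditional on the observed initial segment of the sample path):

\[
\Pr\!\left( \mathbb{E}\!\left[\, \bigl| L(f{\restriction}\sigma)(X_{n+1}) - f(X_{n+1}) \bigr| \,\middle|\, X_1, \dots, X_n \right] \le \varepsilon \right) \ge 1 - \delta
\qquad \text{whenever } |\sigma| \ge s(\delta, \varepsilon).
\]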
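As a purely illustrative sketch of the setting, the following Python snippet generates exchangeable inputs in the de Finetti manner (a latent directing measure is drawn first, then the X_i are sampled i.i.d. from it) and estimates the predictive error of a toy learning rule at X_{n+1}. The function class (thresholds on [0, 1]), the midpoint rule learn_threshold, and all parameter values are our assumptions, not from the paper; the snippet also averages over sample paths rather than conditioning on a fixed X_1, ..., X_n, so it checks an averaged surrogate of the predictive PAC condition.

# Illustrative sketch only: the threshold class, the midpoint learning
# rule, and all parameters below are assumptions, not the paper's method.
import random

T_STAR = 0.4  # unknown target threshold: f(x) = 1 iff x >= T_STAR

def f(x):
    return int(x >= T_STAR)

def draw_path(n):
    """de Finetti sampling: draw a latent directing measure, then sample
    i.i.d. from it. Here the measure is Uniform(0, theta), theta random."""
    theta = random.uniform(0.5, 1.0)
    xs = [random.uniform(0.0, theta) for _ in range(n + 1)]
    return xs[:n], xs[n]  # (X_1, ..., X_n) and X_{n+1}

def learn_threshold(sample):
    """Toy learning rule L applied to f|sigma: return the midpoint between
    the largest negatively labelled and smallest positively labelled point."""
    neg = [x for x, y in sample if y == 0]
    pos = [x for x, y in sample if y == 1]
    return (max(neg, default=0.0) + min(pos, default=1.0)) / 2.0

def predictive_error(n, trials=20000):
    """Monte Carlo frequency of L(f|sigma)(X_{n+1}) != f(X_{n+1}),
    averaged over sample paths."""
    errs = 0
    for _ in range(trials):
        xs, x_next = draw_path(n)
        t_hat = learn_threshold([(x, f(x)) for x in xs])
        errs += int((x_next >= t_hat) != f(x_next))
    return errs / trials

if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, predictive_error(n))

As n grows, the estimated predictive error shrinks, illustrating how learning the predictive behaviour of f at future points of the sample path remains meaningful even though the inputs are only exchangeable, not i.i.d.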
Keywords :
generalisation (artificial intelligence); learning (artificial intelligence); Glivenko-Cantelli theorem; de Finetti theorem; distribution-free PAC learnability; exchangeable input data; exchangeable random variables; learning rule; predictive PAC learnability; complexity theory; convergence; random variables; statistical learning