• DocumentCode
    752131
  • Title

    An efficient universal prediction algorithm for unknown sources with limited training data

  • Author

    Ziv, Jacob

  • Author_Institution
    Dept. of Electr. Eng., Technion-Israel Inst. of Technol., Haifa, Israel
  • Volume
    48
  • Issue
    6
  • fYear
    2002
  • fDate
    6/1/2002 12:00:00 AM
  • Firstpage
    1690
  • Lastpage
    1693
  • Abstract
    Inspired by C. E. Shannon\´s celebrated paper: "Prediction and entropy of printed English" (1951), we consider the optimal prediction error for unknown finite-alphabet ergodic Markov sources, for prediction algorithms that make inference about the most probable incoming letter, where the distribution of the unknown source is apparent only via a short training sequence of N + 1 letters. We allow N to be a polynomial function of K, the order of the Markov source, rather than the classical case where N is allowed to be exponential in K. A lower bound on the prediction error is formulated for such universal prediction algorithms, that are based on suffixes that were observed somewhere in the past "training sequence" X-N-1 (i.e. it is assumed that the universal predictor, given the past (N + 1)-sequence which serves as a training sequence is no better than the optimal predictor given only the longest suffix that appeared somewhere in the past X-N -1 vector). For a class of stationary Markov sources (which includes all Markov sources with positive transition probabilities), a particular universal predictor is introduced, and it is demonstrated that its performance is "optimal" in the sense that it yields a prediction error which is close to the lower bound on the universal prediction error, with limited training data. The results are nonasymptotic in the sense that they express the effect of limited training data on the efficiency of universal predictors. An asymptotically optimal universal predictor which is based on pattern matching appears elsewhere in the literature. However, the prediction error of these algorithms does not necessarily come close to the lower bound in the nonasymptotic region
  • Keywords
    Markov processes; entropy; pattern matching; polynomials; prediction theory; probability; efficient universal prediction algorithm; entropy; finite-alphabet ergodic Markov sources; limited training data; longest suffix; lower bound; nonasymptotic region; optimal prediction error; pattern matching; polynomial function; positive transition probabilities; printed English; source coding; stationary Markov sources; training sequence; unknown sources; Entropy; Error analysis; Inference algorithms; Jacobian matrices; Pattern matching; Polynomials; Prediction algorithms; Statistics; Training data;
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/TIT.2002.1003847
  • Filename
    1003847