• DocumentCode
    2056856
  • Title
    Algorithms for modeling distributions over large alphabets
  • Author
    Orlitsky, Alon; Sajama; Santhanam, Narayana; Viswanathan, Krishnamurthy; Zhang, Junan
  • Author_Institution
    Dept. of Electr. & Comput. Eng., California Univ., San Diego, La Jolla, CA
  • fYear
    2004
  • fDate
    2004
  • Firstpage
    304
  • Lastpage
    304
  • Abstract
    We consider the problem of modeling a distribution whose alphabet size is large relative to the amount of observed data. It is well known that conventional maximum-likelihood estimates do not perform well in that regime. Instead, we find the distribution maximizing the probability of the data's pattern. We derive an efficient algorithm for approximating this distribution. Simulations show that the computed distribution models the data well and yields general estimators that evaluate various data attributes as well as specific estimators designed especially for these tasks.
  • Keywords
    data compression; estimation theory; probability; sequences; alphabet size; data pattern; distribution model; Algorithm design and analysis; Computational modeling; Data engineering; Distributed computing; Frequency; Lagrangian functions; Maximum likelihood estimation; Probability distribution; Testing; Yield estimation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Theory, 2004. ISIT 2004. Proceedings. International Symposium on
  • Conference_Location
    Chicago, IL
  • Print_ISBN
    0-7803-8280-3
  • Type
    conf
  • DOI
    10.1109/ISIT.2004.1365341
  • Filename
    1365341