• DocumentCode
    2606643
  • Title
    Information geometry of adaptive systems
  • Author
    Amari, Shun-Ichi ; Ozeki, Tomoko ; Park, Hyeyoung
  • Author_Institution
    RIKEN, Inst. of Phys. & Chem. Res., Saitama, Japan
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    12
  • Lastpage
    17
  • Abstract
    An adaptive system works in a stochastic environment, so its behavior is represented by a probability distribution, e.g., a conditional probability density of the output given the input. Information geometry is a powerful tool for studying the intrinsic geometry of parameter spaces associated with probability distributions. The article investigates the local Riemannian metric and the topological singular structures of the parameter spaces of hierarchical systems such as multilayer perceptrons. The natural gradient learning method is introduced for such systems; it exhibits ideal learning dynamics that are free of the plateau phenomena. We explain the reason in terms of the topological structure of the singularities that exist in hierarchical systems. We mostly use multilayer perceptrons as examples, but the geometrical structure is common to many hierarchical systems, such as Gaussian mixtures of density functions and ARMA models of time series. Singularities are ubiquitous in hierarchical systems. At singularities, the Fisher information metric degenerates and parameter estimators do not follow a Gaussian distribution. This implies that the Cramér-Rao paradigm does not hold there. Model selection is an important subject in hierarchical systems; however, model selection criteria such as AIC and MDL are derived using the Cramér-Rao paradigm. This study therefore calls for further modification of these criteria. It is a first step toward analyzing the singular structures of the parameter space and their relation to the dynamical behavior of learning.
  • Keywords
    Gaussian processes; adaptive systems; autoregressive moving average processes; hierarchical systems; information theory; learning systems; multilayer perceptrons; parameter space methods; probability; time series; AIC; ARMA models; Cramer-Rao paradigm; Fisher information metric; Gaussian mixtures; MDL; adaptive systems; conditional probability density; density functions; dynamical learning behavior; hierarchical systems; information geometry; local Riemannian metric; model selection; multilayer perceptrons; natural gradient learning method; parameter spaces; probability distribution; stochastic environment; time series; topological singular structures; topological structures; Adaptive systems; Density functional theory; Extraterrestrial measurements; Hierarchical systems; Information geometry; Learning systems; Multilayer perceptrons; Probability distribution; Solid modeling; Stochastic systems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Adaptive Systems for Signal Processing, Communications, and Control Symposium 2000. AS-SPCC. The IEEE 2000
  • Conference_Location
    Lake Louise, Alta.
  • Print_ISBN
    0-7803-5800-7
  • Type
    conf
  • DOI
    10.1109/ASSPCC.2000.882438
  • Filename
    882438