Title :
Representation method for a set of documents from the viewpoint of Bayesian statistics
Author :
Goto, Masayuki ; Ishida, Takashi ; Hirasawa, Shigeichi
Author_Institution :
Fac. of Environ. & Information Studies, Musashi Inst. of Technol., Japan
Abstract :
In this paper, we consider a Bayesian approach to representing a set of documents. Many models have been proposed for this task, including latent semantic analysis (LSA), probabilistic latent semantic analysis (PLSA), the semantic aggregate model (SAM), and Bayesian latent semantic analysis (BLSA). We formulate the Bayes optimal solutions for parameter estimation and for selecting the dimension of the hidden latent class in these models, and we analyze their asymptotic properties.
Keywords :
Bayes methods; belief networks; indexing; information retrieval; parameter estimation; semantic networks; Bayes optimal solutions; Bayesian latent semantic analysis; asymptotic properties; hidden latent class; parameter estimation; probabilistic latent semantic analysis; representation method; semantic aggregate model; Aggregates; Bayesian methods; Databases; Hidden Markov models; Indexes; Indexing; Information retrieval; Large scale integration; Parameter estimation; Statistics;
Conference_Title :
Systems, Man and Cybernetics, 2003. IEEE International Conference on
Print_ISBN :
0-7803-7952-7
DOI :
10.1109/ICSMC.2003.1245715