Regional Information Center for Science and Technology - Word Topical Mixture Models for Extractive Spoken Document Summarization

DocumentCode :

3194399

Title :

Word Topical Mixture Models for Extractive Spoken Document Summarization

Author :

Chen, Berlin ; Chen, Yi-Ting

Author_Institution :

Nat. Taiwan Normal Univ., Taipei

fYear :

2007

fDate :

2-5 July 2007

Firstpage :

Lastpage :

Abstract :

This paper considers extractive summarization of Chinese spoken documents. In contrast to conventional approaches, we address the extractive summarization problem under a probabilistic generative framework. A word topical mixture model (w-TMM) is proposed to explore the co-occurrence relationships between words of the language. Each sentence of the spoken document to be summarized is treated as a composite w-TMM for generating the document, and sentences are ranked and selected according to their likelihoods. Various modeling structures and learning approaches were extensively investigated. In addition, the summarization capabilities were verified by comparison with other conventional summarization approaches. The experiments, performed on Chinese broadcast news collected in Taiwan, showed noticeable performance gains. The proposed summarization technique has also been integrated into our prototype system for voice retrieval of broadcast news via mobile devices.
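
The ranking idea in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: each word model is taken to be a mixture over topics, a sentence model is taken to be a count-weighted combination of the models of its words, and sentences are scored by the log-likelihood of the whole document under their model. The toy vocabulary, the two topics, all probability values, and the function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy parameters (illustrative only, not from the paper).
VOCAB = ["storm", "rain", "vote", "election"]

# Assumed P(word | topic) for two topics; each row sums to 1.
P_WORD_GIVEN_TOPIC = [
    {"storm": 0.5, "rain": 0.4, "vote": 0.05, "election": 0.05},
    {"storm": 0.05, "rain": 0.05, "vote": 0.4, "election": 0.5},
]

# Assumed P(topic | word model): the topical mixture weights of each word.
P_TOPIC_GIVEN_WORD = {
    "storm": [0.9, 0.1],
    "rain": [0.8, 0.2],
    "vote": [0.1, 0.9],
    "election": [0.2, 0.8],
}


def word_model_prob(target, source):
    """P(target | model of source word), as a mixture over topics."""
    weights = P_TOPIC_GIVEN_WORD[source]
    return sum(w * P_WORD_GIVEN_TOPIC[k][target] for k, w in enumerate(weights))


def sentence_prob(target, sentence):
    """P(target | sentence): count-weighted combination of its word models."""
    counts = Counter(sentence)
    total = sum(counts.values())
    return sum((c / total) * word_model_prob(target, w) for w, c in counts.items())


def document_log_likelihood(document, sentence):
    """Log-likelihood of the document's words under the sentence model."""
    return sum(
        c * math.log(sentence_prob(w, sentence))
        for w, c in Counter(document).items()
    )


def rank_sentences(sentences):
    """Return sentence indices ordered from highest to lowest likelihood."""
    document = [w for s in sentences for w in s]
    scored = [(document_log_likelihood(document, s), i) for i, s in enumerate(sentences)]
    return [i for _, i in sorted(scored, reverse=True)]


if __name__ == "__main__":
    sentences = [
        ["storm", "rain", "storm"],
        ["vote", "election"],
        ["rain", "storm"],
    ]
    print(rank_sentences(sentences))
```

In this toy document most words concern the weather, so the off-topic sentence about voting receives the lowest likelihood; a summary would be formed by taking sentences from the top of the ranking until a length budget is met. In a real system the mixture parameters would be learned from data rather than set by hand.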

Keywords :

document handling; speech processing; Chinese broadcast news; Chinese spoken documents; extractive spoken document summarization; mobile devices; probabilistic generative framework; voice retrieval; word topical mixture models; Broadcasting; Computer science; Data mining; Hidden Markov models; Natural languages; Performance gain; Prototypes; Speech; Support vector machine classification; Support vector machines;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Multimedia and Expo, 2007 IEEE International Conference on

Conference_Location :

Beijing

Print_ISBN :

1-4244-1016-9

Electronic_ISBN :

1-4244-1017-7

Type :

conf

DOI :

10.1109/ICME.2007.4284584

Filename :

4284584

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3194399