DocumentCode :
2769020
Title :
Spoken document summarization using relevant information
Author :
Chen, Yi-Ting ; Lin, Shih-Hsiang ; Wang, Hsin-Min ; Chen, Berlin
Author_Institution :
Acad. Sinica, Taipei
fYear :
2007
fDate :
9-13 Dec. 2007
Firstpage :
189
Lastpage :
194
Abstract :
Extractive summarization usually automatically selects indicative sentences from a document according to a certain target summarization ratio, and then sequences them to form a summary. In this paper, we investigate the use of information from relevant documents retrieved from a contemporary text collection for each sentence of a spoken document to be summarized in a probabilistic generative framework for extractive spoken document summarization. In the proposed methods, the probability of a document being generated by a sentence is modeled by a hidden Markov model (HMM), while the retrieved relevant text documents are used to estimate the HMM's parameters and the sentence's prior probability. The results of experiments on Chinese broadcast news compiled in Taiwan show that the new methods outperform the previous HMM approach.
Keywords :
hidden Markov models; information retrieval; probability; speech processing; text analysis; contemporary text collection; extractive summarization; hidden Markov model; indicative sentences; probabilistic generative model; spoken document summarization; text document; Data mining; Information resources; Information science; Multimedia communication; Parameter estimation; Speech; Support vector machine classification; Support vector machines; relevance model; relevant document; spoken document; summarization
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-1746-9
Electronic_ISBN :
978-1-4244-1746-9
Type :
conf
DOI :
10.1109/ASRU.2007.4430107
Filename :
4430107