Title :
A recurrent neural network language modeling framework for extractive speech summarization
Author :
Kuan-Yu Chen ; Shih-Hung Liu ; Berlin Chen ; Hsin-Min Wang ; Wen-Lian Hsu ; Hsin-Hsi Chen
Author_Institution :
Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan
Abstract :
Extractive speech summarization, which aims to automatically select a set of representative sentences from a spoken document so as to concisely express its most important theme, has been an active area of research and development. A recent school of thought employs the language modeling (LM) approach for important sentence selection, which has proven effective for speech summarization in an unsupervised fashion. However, one of the major challenges facing the LM approach is how to formulate the sentence models and accurately estimate their parameters for each spoken document to be summarized. This paper continues this general line of research, and its contribution is two-fold. First, we propose a novel and effective recurrent neural network language modeling (RNNLM) framework for speech summarization, in which the derived sentence models capture not only word usage cues but also long-span structural information about word co-occurrence relationships within spoken documents, thereby avoiding the strict bag-of-words assumption. Second, the method derived from our proposed framework and several widely used unsupervised methods are analyzed and compared extensively. A series of experiments conducted on a broadcast news summarization task seems to demonstrate the performance merits of our summarization method compared with several state-of-the-art unsupervised methods.
Keywords :
feature extraction; feature selection; parameter estimation; recurrent neural nets; signal representation; speech recognition; word processing; LM approach; RNNLM framework; bag-of-word assumption; broadcast news summarization; extractive speech summarization; long-span structural information; recurrent neural network language modeling framework; sentence representation; sentence selection model; spoken document summarization; word cooccurrence relationship; word usage cues; Data models; Measurement; Recurrent neural networks; Speech; Training; Vectors; language modeling; recurrent neural network; speech summarization
Conference_Title :
Multimedia and Expo (ICME), 2014 IEEE International Conference on
Conference_Location :
Chengdu
DOI :
10.1109/ICME.2014.6890220