Abstract :
Speech summarization technology, which extracts important information and removes irrelevant information from speech, is expected to play an important role in building speech archives and improving the efficiency of spoken document retrieval. However, speech summarization poses a number of significant challenges that distinguish it from general text summarization. Fundamental problems include speech recognition errors, disfluencies, and the difficulty of sentence segmentation. Typical speech summarization systems consist of speech recognition, sentence segmentation, sentence extraction, and sentence compaction components. Most research has focused on sentence extraction, using LSA (latent semantic analysis), MMR (maximal marginal relevance), or feature-based approaches, but no decisively superior method has yet been found. Proper sentence segmentation is also essential to good summarization performance. How to evaluate speech summarization results objectively is another important issue. Several measures, including the SumACCY and ROUGE families, have been proposed, and correlation analyses between subjective and objective evaluation scores have been performed. Although these measures are useful for ranking summarization methods, they do not correlate well with human evaluations, especially when spontaneous speech is targeted.
Keywords :
natural language processing; speech recognition; automatic speech summarization; disfluencies; feature-based approaches; latent semantic analysis; maximal marginal relevance; sentence compaction components; sentence extraction; sentence segmentation; speech archives; speech recognition errors; spoken document retrieval; Broadcasting; Compaction; Computer science; Data mining; Information retrieval; Performance analysis; Speech analysis; Speech synthesis; Synthesizers