Title :
Topic identification based extrinsic evaluation of summarization techniques applied to conversational speech
Author :
Harwath, David ; Hazen, Timothy J.
Author_Institution :
MIT Lincoln Lab., Lexington, MA, USA
Abstract :
Document summarization algorithms are most commonly evaluated according to the intrinsic quality of the summaries they produce. An alternate approach is to examine the extrinsic utility of a summary, measured by the ability of the summary to aid a human in the completion of a specific task. In this paper, we use topic identification as a proxy for relevancy determination in the context of an information retrieval task, and a summary is deemed effective if it enables a user to determine the topical content of a retrieved document. We utilize Amazon's Mechanical Turk service to perform a large-scale human study contrasting four different summarization systems applied to conversational speech from the Fisher Corpus. We show that these results appear to be correlated with the performance of an automated topic identification system, and argue that this automated system can act as a low-cost proxy for a human evaluation during the development stages of a summarization system.
Keywords :
information retrieval; speech processing; Amazon Mechanical Turk service; Fisher Corpus; automated system; conversational speech; document retrieval; document summarization algorithms; extrinsic utility; human evaluation; information retrieval task; intrinsic quality; large-scale human study; low-cost proxy; relevancy determination; topic identification based extrinsic evaluation; topical content; Computational modeling; Context; Error analysis; Humans; Probabilistic logic; Speech; Vectors; Document Summarization; Topic Modeling
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6289061