Title :
Speaker Overlaps and ASR Errors in Meetings: Effects Before, During, and After the Overlap
Author :
Çetin, Özgür ; Shriberg, Elizabeth
Author_Institution :
Int. Comput. Sci. Inst., Berkeley, CA
Abstract :
We analyze automatic speech recognition (ASR) errors made by a state-of-the-art meeting recognizer, with respect to locations of overlapping speech. Our analysis focuses on recognition errors made both during an overlap and in the regions immediately preceding and following the location of overlapped speech. We devise an experimental paradigm to allow examination of the same foreground speech both with and without naturally occurring cross-talk. We then analyze ASR errors with respect to a number of factors, including the severity of the cross-talk and distance from the overlap region. In addition to reporting effects on ASR errors, we discover a number of interesting phenomena. First, we find that overlaps tend to occur at high-perplexity regions in the foreground talker's speech. Second, word sequences within overlaps have higher perplexity than those in nonoverlaps, if using trigrams or 4-grams, but the unigram perplexity within overlaps is considerably lower than that of nonoverlaps. An explanation for this behavior is proposed, based on the preponderance of multiple short dialog acts found in overlap regions. Third, we discover that the word error rate (WER) after overlaps is consistently lower than that before the overlap. This finding cannot be explained by the recognition process itself; rather, the foreground speaker appears to reduce perplexity shortly after being overlapped. Taken together, these observations suggest that the automatic modeling of meetings could benefit from a broader view of the relationship between speaker overlap and ASR in natural conversation.
Keywords :
speech recognition; automatic speech recognition; cross-talk; overlapping speech; speaker overlaps; state-of-the-art meeting recognizer; word error rate; word sequences; Automatic speech recognition; Computer errors; Computer science; Error analysis; Loudspeakers; Microphones; NIST; Speech analysis; Speech recognition; Telephony
Conference_Title :
2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006) Proceedings
Conference_Location :
Toulouse
Print_ISBN :
1-4244-0469-X
DOI :
10.1109/ICASSP.2006.1660031