Regional Information Center for Science and Technology (RICEST) - Speaker Overlaps and ASR Errors in Meetings: Effects Before, During, and After the Overlap

DocumentCode :

454576

Title :

Speaker Overlaps and ASR Errors in Meetings: Effects Before, During, and After the Overlap

Author :

Çetin, Özgür ; Shriberg, Elizabeth

Author_Institution :

Int. Comput. Sci. Inst., Berkeley, CA

Volume :

fYear :

2006

fDate :

14-19 May 2006

Abstract :

We analyze automatic speech recognition (ASR) errors made by a state-of-the-art meeting recognizer, with respect to locations of overlapping speech. Our analysis focuses on recognition errors made both during an overlap and in the regions immediately preceding and following the location of overlapped speech. We devise an experimental paradigm to allow examination of the same foreground speech both with and without naturally occurring cross-talk. We then analyze ASR errors with respect to a number of factors, including the severity of the cross-talk and distance from the overlap region. In addition to reporting effects on ASR errors, we discover a number of interesting phenomena. First, we find that overlaps tend to occur at high-perplexity regions in the foreground talker's speech. Second, word sequences within overlaps have higher perplexity than those in nonoverlaps, if using trigrams or 4-grams, but the unigram perplexity within overlaps is considerably lower than that of nonoverlaps. An explanation for this behavior is proposed, based on the preponderance of multiple short dialog acts found in overlap regions. Third, we discover that the word error rate (WER) after overlaps is consistently lower than that before the overlap. This finding cannot be explained by the recognition process itself; rather, the foreground speaker appears to reduce perplexity shortly after being overlapped. Taken together, these observations suggest that the automatic modeling of meetings could benefit from a broader view of the relationship between speaker overlap and ASR in natural conversation.

Keywords :

speech recognition; automatic speech recognition; cross-talk; overlapping speech; speaker overlaps; state-of-the-art meeting recognizer; word error rate; word sequences; Automatic speech recognition; Computer errors; Computer science; Error analysis; Loudspeakers; Microphones; NIST; Speech analysis; Speech recognition; Telephony

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on

Conference_Location :

Toulouse

ISSN :

1520-6149

Print_ISBN :

1-4244-0469-X

Type :

conf

DOI :

10.1109/ICASSP.2006.1660031

Filename :

1660031

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=454576