Regional Information Center for Science and Technology (RICEST) - Analyzing quality of crowd-sourced speech transcriptions of noisy audio for acoustic model adaptation

DocumentCode :

3162067

Title :

Analyzing quality of crowd-sourced speech transcriptions of noisy audio for acoustic model adaptation

Author :

Audhkhasi, Kartik ; Georgiou, Panayiotis G. ; Narayanan, Shrikanth S.

Author_Institution :

Signal Anal. & Interpretation Lab. (SAIL), Univ. of Southern California, Los Angeles, CA, USA

fYear :

2012

fDate :

25-30 March 2012

Firstpage :

4137

Lastpage :

4140

Abstract :

The accuracy of crowd-sourced speech transcriptions varies depending on a variety of factors. This paper studies the impact of one such factor, namely, the quality of audio. We employed a speech database with babble noise at three SNR levels (clean, 2 dB and -2 dB) and asked workers on Amazon Mechanical Turk to transcribe it. Two interesting observations emerge. First, as expected, the quality of transcripts combined by word frequency based ROVER decreases with decreasing SNR. Further, we demonstrate that the use of some unsupervised reliability scores can improve the transcription quality, with increasing benefits at lower SNR. Second, we do not observe a significant drop in the performance of acoustic models adapted with increasing transcription noise. This highlights the surprising robustness of crowd-sourced transcripts for acoustic model adaptation.

Keywords :

reliability; speech recognition; Amazon Mechanical Turk; ROVER; SNR levels; acoustic model adaptation; audio quality; babble noise; crowd-sourced speech transcriptions; noisy audio; speech database; unsupervised reliability; Acoustics; Adaptation models; Error analysis; Noise; Noise measurement; Reliability; Speech; Crowd-sourcing; automatic speech recognition; speech transcription

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location :

Kyoto

ISSN :

1520-6149

Print_ISBN :

978-1-4673-0045-2

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2012.6288829

Filename :

6288829

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3162067