• DocumentCode
    118102
  • Title

    An event-related brain potential study on the impact of speech recognition errors

  • Author

    Sakti, Sakriani ; Odagaki, Yu. ; Sasakura, Takafumi ; Neubig, Graham ; Toda, Tomoki ; Nakamura, Satoshi

  • Author_Institution
    Grad. Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Nara, Japan
  • fYear
    2014
  • fDate
    9-12 Dec. 2014
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Most automatic speech recognition (ASR) systems, which aim for perfect transcription of utterances, are trained and tuned by minimizing the word error rate (WER). In this framework, all errors (substitutions, deletions, insertions) in any word are treated uniformly, even though their impact is not the same. The size of the impact of each error type, and exactly how the types differ, remain unknown. Several studies have proposed alternatives to the WER metric, but no analysis has investigated how the human brain processes language and perceives the effect of erroneous ASR output. In this research we use event-related brain potential (ERP) studies to directly analyze brain activity in response to ASR errors. Our results reveal that the peak amplitudes of the positive shift after substitution and deletion violations are much larger than those after insertion violations. This finding indicates that humans perceive each error differently depending on its impact on the whole sentence. To investigate the effect of this finding, we formulated a new weighted word error rate metric based on the ERP results: ERP-WWER. We re-evaluated ASR performance using the new ERP-WWER metric and compared the results with those of the standard WER.
  • Keywords
    brain; error analysis; speech recognition; ASR; ERP-WWER metric; automatic speech recognition; brain activities; deletion violations; event-related brain potential; human brain process language; peak amplitudes; positive shift; speech recognition errors; substitution violations; weighted word error rate metric; Brain; Measurement; Semantics; Speech; Speech recognition; Standards; Syntactics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
  • Conference_Location
    Siem Reap
  • Type

    conf

  • DOI
    10.1109/APSIPA.2014.7041620
  • Filename
    7041620
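
The abstract contrasts the standard WER, which counts substitutions, deletions, and insertions uniformly, with a weighted variant. This record does not give the ERP-WWER formula or its weights, so the Python below is only a minimal sketch of a generic weighted word error rate. The function names (`count_errors`, `weighted_wer`) and the weight values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def count_errors(ref: List[str], hyp: List[str]) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) from one
    minimum edit-distance alignment of hyp against ref."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
            )

    # Backtrace one optimal path, preferring the diagonal on ties.
    subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                subs += cost
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, dels, ins


def weighted_wer(ref: List[str], hyp: List[str],
                 weights: Dict[str, float]) -> float:
    """Weighted error rate: (w_sub*S + w_del*D + w_ins*I) / len(ref)."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    subs, dels, ins = count_errors(ref, hyp)
    total = (weights["sub"] * subs
             + weights["del"] * dels
             + weights["ins"] * ins)
    return total / len(ref)


# Uniform weights reproduce the standard WER.
UNIFORM = {"sub": 1.0, "del": 1.0, "ins": 1.0}
# Hypothetical weights that down-weight insertions; NOT the paper's values.
EXAMPLE_WEIGHTS = {"sub": 1.0, "del": 1.0, "ins": 0.5}
```

With `UNIFORM` the function reduces to the ordinary WER, so the two metrics can be compared on the same alignments simply by swapping the weight dictionary. When several alignments share the minimum edit distance, the error-type counts depend on the tie-breaking rule, which is a design choice of this sketch.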