• DocumentCode
    3752272
  • Title
    Automatic classification of usability of ASR result for real-time captioning of lectures
  • Author
    Yuya Akita;Nobuhiro Kuwahara;Tatsuya Kawahara
  • Author_Institution
    School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
  • fYear
    2015
  • Firstpage
    19
  • Lastpage
    22
  • Abstract
    To support hearing-impaired students in the classroom, real-time captioning and note taking using automatic speech recognition (ASR) have been investigated. However, even with ASR, manual editing is needed to check and correct recognition errors and redundant spoken expressions in ASR results, which often delays the presentation of captions. For efficient editing and quick presentation, we propose an automatic classification of ASR results in terms of their usability as caption text, and a presentation method based on that classification. In this study, we define usability in terms of syntactic correctness, errors and redundant spoken expressions in ASR results. Based on this definition, each unit of ASR results is classified as "valid," "invalid" or "to be checked," using hand-crafted rules and a machine learning framework. When presenting captions, "valid" input is presented promptly, while "to be checked" input is manually edited and then added to the captions. We developed a real-time captioning system incorporating the automatic classification method and the presentation method, and conducted a trial of this system in a university lecture.
  • Keywords
    "Usability","Delays","Real-time systems","Speech","Data models","Informatics","Automatic speech recognition"
  • Publisher
    IEEE
  • Conference_Titel
    Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific
  • Type
    conf
  • DOI
    10.1109/APSIPA.2015.7415524
  • Filename
    7415524
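
The presentation flow summarized in the abstract (three usability labels, prompt display of "valid" units, manual editing of "to be checked" units) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Usability`, `CaptionRouter`, `feed`, `submit_edit` and `toy_classifier` are hypothetical, the toy classifier merely stands in for the paper's hand-crafted rules and machine learning framework, and withholding "invalid" units is an assumption, since the abstract does not say what happens to them.

```python
from collections import deque
from enum import Enum
from typing import Callable, Deque, List


class Usability(Enum):
    """The three labels named in the abstract."""
    VALID = "valid"
    INVALID = "invalid"
    TO_BE_CHECKED = "to be checked"


class CaptionRouter:
    """Routes each ASR unit according to its usability label (illustrative only)."""

    def __init__(self, classify: Callable[[str], Usability]) -> None:
        self.classify = classify
        self.captions: List[str] = []          # text already shown as captions
        self.edit_queue: Deque[str] = deque()  # units waiting for a human editor
        # ASSUMPTION: "invalid" units are withheld; the abstract does not specify this.
        self.withheld: List[str] = []

    def feed(self, unit: str) -> Usability:
        """Classify one ASR unit and route it."""
        label = self.classify(unit)
        if label is Usability.VALID:
            self.captions.append(unit)         # presented promptly
        elif label is Usability.TO_BE_CHECKED:
            self.edit_queue.append(unit)       # needs manual editing first
        else:
            self.withheld.append(unit)
        return label

    def submit_edit(self, corrected: str) -> None:
        """The editor finished the oldest queued unit; add the corrected text."""
        self.edit_queue.popleft()
        self.captions.append(corrected)


def toy_classifier(unit: str) -> Usability:
    """Hypothetical stand-in for the paper's rules and learned classifier."""
    words = unit.split()
    if not words:
        return Usability.INVALID
    if "<unk>" in words:                       # made-up marker for a doubtful word
        return Usability.TO_BE_CHECKED
    return Usability.VALID


if __name__ == "__main__":
    router = CaptionRouter(toy_classifier)
    for unit in ["today we discuss speech recognition", "the <unk> model", ""]:
        print(repr(unit), "->", router.feed(unit).value)
    router.submit_edit("the acoustic model")
    print(router.captions)
```

In this sketch an edited unit is appended when the editor submits it, so it can appear after "valid" units that were spoken later. Whether the actual system reorders captions is not stated in the abstract.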