• DocumentCode
    2769858
  • Title

    A Mandarin lecture speech transcription system for speech summarization

  • Author

    Chan, Ho Yin ; Zhang, Justin Jian ; Fung, Pascale ; Cao, Lu

  • Author_Institution
    Univ. of Sci. & Technol., Hong Kong
  • fYear
    2007
  • fDate
    9-13 Dec. 2007
  • Firstpage
    467
  • Lastpage
    471
  • Abstract
    This paper introduces our work on Mandarin lecture speech transcription. In particular, we present our work on a small database, which contains only 16 hours of audio data and 0.16 M words of text data. A range of experiments has been carried out to improve the performance of the acoustic model and the language model. These include adapting the lecture speech data to the reading speech data for acoustic modeling, and using lecture conference papers, PowerPoint slides and similar-domain web data for language modeling. We also study the effects of automatic segmentation, unsupervised acoustic model adaptation and language model adaptation in our recognition system. Using a 3×RT multi-pass decoding strategy, we obtain 70.3% accuracy in our final system. Finally, we feed the output of our speech transcription system into an SVM summarizer and obtain a ROUGE-L F-measure of 66.5%.
  • Keywords
    natural language processing; speech recognition; Mandarin lecture speech transcription system; automatic segmentation; language modeling; multiple passes decoding strategy; recognition system; speech summarization; Adaptation model; Audio databases; Decoding; Error analysis; Humans; Natural languages; Power system modeling; Speech recognition; Testing; Training data; lecture speech transcription; model adaptation; multi-pass decoding; speech summarization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on
  • Conference_Location
    Kyoto
  • Print_ISBN
    978-1-4244-1746-9
  • Electronic_ISBN
    978-1-4244-1746-9
  • Type

    conf

  • DOI
    10.1109/ASRU.2007.4430157
  • Filename
    4430157