• DocumentCode
    2314535
  • Title
    Automatic Thai audio/transcription segmentation
  • Author
    Santiklang, Chatree ; Wutiwiwatchai, Chai ; Boonpramuk, Panuthat
  • Author_Institution
    Control Systems & Instrumentation Engineering, King Mongkut's University of Technology Thonburi, Bangkok
  • fYear
    2009
  • fDate
    6-9 May 2009
  • Firstpage
    1018
  • Lastpage
    1021
  • Abstract
    This paper proposes an automatic algorithm for segmenting audio and transcription streams for use in building a large vocabulary continuous speech recognition (LVCSR) system. LVCSR training data are often derived from audio materials that come with transcriptions, such as sermons and news articles. In Thai, these resources usually take the form of long wave files paired with text articles written without explicit word or sentence boundaries, yet such boundaries are crucial for acoustic and language model training in LVCSR. The proposed algorithm segments a large wave file into short utterances using energy detection. The transcription is then aligned to each utterance using dynamic time warping (DTW) combined with a classification and regression tree (CART) confidence measure computed on a per-phone basis. An evaluation shows that the DTW alignment procedure still needs improvement, while the CART confidence measure gives promising results.
  • Keywords
    regression analysis; speech recognition; trees (mathematics); CART confidence measure; LVCSR training data; automatic Thai audio segmentation; dynamic time warping; energy detection; language model training; large vocabulary continuous speech recognition system; long wave files; regression tree; sentence separation; transcription segmentation; Acoustic measurements; Acoustic signal detection; Acoustic waves; Classification tree analysis; Regression tree analysis; Speech recognition; Streaming media; Time measurement; Training data; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, 2009. ECTI-CON 2009. 6th International Conference on
  • Conference_Location
    Pattaya, Chonburi
  • Print_ISBN
    978-1-4244-3387-2
  • Electronic_ISBN
    978-1-4244-3388-9
  • Type
    conf
  • DOI
    10.1109/ECTICON.2009.5137218
  • Filename
    5137218
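
The two signal-level steps named in the abstract, energy-based utterance segmentation and dynamic time warping, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame length, the energy threshold, the silence rule, and the absolute-difference DTW cost are all assumptions chosen for illustration, and the CART confidence measure is not modelled.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def segment_by_energy(samples, frame_len=4, threshold=0.01, min_silence_frames=2):
    """Return (start, end) sample ranges of high-energy regions.

    Hypothetical rule: an utterance is closed once at least
    min_silence_frames consecutive frames fall at or below threshold.
    """
    segments = []
    seg_start = None    # frame index where the current utterance began
    last_voiced = None  # last frame index above the threshold
    silence = 0         # consecutive low-energy frames seen so far
    for i, energy in enumerate(frame_energies(samples, frame_len)):
        if energy > threshold:
            if seg_start is None:
                seg_start = i
            last_voiced = i
            silence = 0
        elif seg_start is not None:
            silence += 1
            if silence >= min_silence_frames:
                segments.append((seg_start * frame_len, (last_voiced + 1) * frame_len))
                seg_start = None
    if seg_start is not None:
        # The signal ended while an utterance was still open.
        segments.append((seg_start * frame_len, (last_voiced + 1) * frame_len))
    return segments


def dtw_distance(a, b, cost=lambda x, y: abs(x - y)):
    """Classic DTW: minimum cumulative cost of aligning sequence a to b."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three allowed predecessor cells.
            d[i][j] = cost(a[i - 1], b[j - 1]) + min(
                d[i - 1][j], d[i][j - 1], d[i - 1][j - 1]
            )
    return d[n][m]


# Toy signal: silence, a burst, silence, a shorter burst.
toy = [0.0] * 8 + [0.5] * 8 + [0.0] * 8 + [0.5] * 4
utterances = segment_by_energy(toy)
```

In a real pipeline the sequences passed to `dtw_distance` would be feature vectors or phone-level representations with a suitable cost function, rather than the plain numbers used here.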