Title :
Automatic audio indexing alignment for Thai broadcast news
Author :
Tantibundhit, C. ; Jarasboonpaisan, T. ; Natenee, A. ; Thatphithakkul, N. ; Saykhum, K.
Author_Institution :
MedIntelligence & Innovation Lab., Thammasat Univ., Pathumthani, Thailand
Abstract :
We compare the recognition rates of three language models (LMs), namely large vocabulary continuous speech recognition (LVCSR), interpolated LVCSR, and N-gram, for automatic audio indexing alignment of Thai broadcast news. Fifty news clips across ten news categories were collected from MCOT. The audio clips are used as input to the three recognition systems, and the recognized words are compared with the available original transcriptions. The experimental results show that, without regard to time alignment, the N-gram gives the highest percentage of word correction (68.55%), followed by the interpolated LVCSR (43.94%) and the LVCSR (31.24%). When the time alignment of correctly recognized words is considered at an alignment error of 0.10 sec, the N-gram again gives the highest percentage of word correction at 60.56%, followed by the interpolated LVCSR at 38.59% and the LVCSR at 27.29%. A word landmark technique is then applied to align the incorrectly recognized words; at an alignment error of 0.10 sec, it improves the alignment to 89.60% for the N-gram, 83.15% for the interpolated LVCSR, and 67.86% for the LVCSR.
Keywords :
audio signal processing; indexing; natural language processing; speech recognition; vocabulary; word processing; N-gram; automatic audio indexing alignment; broadcast news; error alignment; interpolated large vocabulary continuous speech recognition; language models; recognition system; time alignment; word correction; word landmark; Automatic speech recognition; Broadcast technology; Multimedia communication; Natural languages; Streaming media; TV broadcasting; Technological innovation; LVCSR; audio indexing alignment; interpolated LVCSR
Conference_Title :
Electrical Engineering/Electronics Computer Telecommunications and Information Technology (ECTI-CON), 2010 International Conference on
Conference_Location :
Chiang Mai
Print_ISBN :
978-1-4244-5606-2
Electronic_ISBN :
978-1-4244-5607-9