Title :
Automatic segmentation and annotation of audio archive documents
Author :
Bohac, Marek ; Blavka, Karel
Author_Institution :
Inst. of Inf. Technol. & Electron., Tech. Univ. of Liberec, Liberec, Czech Republic
Abstract :
The paper deals with automatic processing of spoken documents from the Czech Radio archive, which contains hundreds of thousands of audio recordings. The ultimate goal of the project is to transcribe them and to give the public access to their content. In this paper, we focus on documents that have already been transcribed (by humans or in another way) and are to be synchronized (time-aligned) with the text. We aim to develop a method that is time-efficient and at the same time robust to incorrect or incomplete transcriptions. The method combines two speech recognition techniques. The first, a word spotting method, searches the audio for selected words from the transcription and proposes points where the document can be split into shorter, homogeneous segments covered by the text transcription. For these segments, we apply the second technique, a modified forced-alignment procedure, to obtain a time stamp for each word in the transcription. The method runs with a real-time factor of 0.5 and yields 95.5% word alignment precision. So far, it has been used for transcribing and indexing more than 552 hours of archive recordings.
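The splitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the anchor-selection rule (long words that occur exactly once), the score threshold, the greedy consistency check, and all names (`Hit`, `select_anchor_words`, `split_points`, `segments`) are assumptions made for illustration. The word spotter is assumed to return, for each hit, the word, its start time in seconds, and a confidence score.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Hit:
    """One word-spotting detection (hypothetical structure)."""
    word: str     # spotted word
    time: float   # assumed start time of the word, in seconds
    score: float  # detector confidence in [0, 1]


def select_anchor_words(transcript, min_len=6):
    """Map transcript index -> word for long words that occur exactly once.

    The uniqueness and length rule is an assumed heuristic, not taken
    from the paper.
    """
    counts = Counter(transcript)
    return {i: w for i, w in enumerate(transcript)
            if counts[w] == 1 and len(w) >= min_len}


def split_points(transcript, hits, min_score=0.8):
    """Return (word_index, time) pairs increasing in both index and time."""
    index_of = {w: i for i, w in select_anchor_words(transcript).items()}
    candidates = sorted(
        (index_of[h.word], h.time)
        for h in hits
        if h.word in index_of and h.score >= min_score
    )
    points = []
    for idx, t in candidates:
        # Greedy filter: keep a hit only if it follows the previous
        # accepted hit in both text order and time.
        if not points or (idx > points[-1][0] and t > points[-1][1]):
            points.append((idx, t))
    return points


def segments(transcript, points, duration):
    """Cut the transcript and the audio time line at the split points."""
    bounds = [(0, 0.0)] + points + [(len(transcript), duration)]
    return [(transcript[i0:i1], t0, t1)
            for (i0, t0), (i1, t1) in zip(bounds, bounds[1:])]
```

Each returned segment pairs a slice of the transcription with a time interval, which is the kind of input a forced-alignment step would then refine into per-word time stamps. The greedy filter is the simplest possible choice; a longest-increasing-subsequence search would tolerate an early false alarm better.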
Keywords :
audio signal processing; document handling; speech recognition; Czech Radio archive; audio archive document annotation; audio archive document segmentation; forced-alignment procedure; speech recognition techniques; text transcription; word spotting method; Accuracy; Acoustics; Databases; Hidden Markov models; Reliability; Speech; Vocabulary; audio archive processing; forced alignment; word spotting
Conference_Title :
Electronics, Control, Measurement and Signals (ECMS), 2011 10th International Workshop on
Conference_Location :
Liberec
Print_ISBN :
978-1-61284-397-1
Electronic_ISBN :
978-1-61284-396-4
DOI :
10.1109/IWECMS.2011.5952373