Automatic segmentation and annotation of audio archive documents

Author

Bohac, Marek ; Blavka, Karel

Author_Institution

Inst. of Inf. Technol. & Electron., Tech. Univ. of Liberec, Liberec, Czech Republic

fYear

2011

fDate

1-3 June 2011

Firstpage

1

Lastpage

6

Abstract

The paper deals with the automatic processing of spoken documents from the Czech Radio archive, which contains hundreds of thousands of audio recordings. The ultimate goal of the project is to transcribe them and give the public access to their content. In this paper, we focus on documents that have already been transcribed (by humans or by other means) and are to be synchronized (time-aligned) with the text. We aim to develop a method that is time-efficient and, at the same time, robust to incorrect or incomplete transcriptions. The method combines two speech recognition techniques. The first, a word-spotting method, searches for selected words from the transcription and proposes points where the document can be split into shorter, homogeneous segments covered by the text transcription. For these segments, we use a modified forced-alignment procedure to obtain a time stamp for each word in the transcription. The method runs at a real-time factor of 0.5 and yields 95.5% word alignment precision. So far, it has been used to transcribe and index more than 552 hours of archive recordings.

Keywords

audio signal processing; document handling; speech recognition; Czech Radio archive; audio archive document annotation; audio archive document segmentation; forced-alignment procedure; speech recognition techniques; text transcription; word spotting method; Accuracy; Acoustics; Databases; Hidden Markov models; Reliability; Speech; Vocabulary; audio archive processing; forced alignment; word spotting

fLanguage

English

Publisher

IEEE

Conference_Titel

Electronics, Control, Measurement and Signals (ECMS), 2011 10th International Workshop on

Conference_Location

Liberec

Print_ISBN

978-1-61284-397-1

Electronic_ISBN

978-1-61284-396-4

Type

conf

DOI

10.1109/IWECMS.2011.5952373

Filename

5952373