DocumentCode
1830320
Title
Improving the Transcription of Academic Lectures for Information Retrieval
Author
Mbogho, Audrey ; Marquard, Stephen
Author_Institution
Dept. of Comput. Sci., Univ. of Cape Town, Cape Town, South Africa
Volume
2
fYear
2013
fDate
4-7 Dec. 2013
Firstpage
560
Lastpage
567
Abstract
Recording university lectures through lecture capture systems is increasingly common, generating large amounts of audio and video data. Transcribing recordings greatly enhances their usefulness by making them easy to search. However, recordings accumulate rapidly, rendering manual transcription impractical. Automatic transcription, on the other hand, suffers from low levels of accuracy, partly due to the special language of academic disciplines, which standard language models do not cover. This paper looks into the use of Wikipedia to dynamically adapt language models for scholarly speech. We propose Ranked Word Correct Rate as a new metric better aligned with the goals of improving transcript search ability and specialist word recognition. The study shows that, while overall transcription accuracy may remain low, targeted language modelling can substantially improve search ability, an important goal in its own right.
Keywords
Web sites; educational technology; further education; information retrieval; Wikipedia; academic disciplines; academic lecture transcription; audio data; automatic transcription; lecture capture systems; ranked word correct rate; search ability; standard language models; university lectures; video data; word recognition; Accuracy; Adaptation models; Crawlers; Electronic publishing; Encyclopedias; Internet
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications (ICMLA), 2013 12th International Conference on
Conference_Location
Miami, FL
Type
conf
DOI
10.1109/ICMLA.2013.177
Filename
6786171