Title :
The Cambridge University spoken document retrieval system
Author :
Johnson, S.E. ; Jourlin, P. ; Moore, G.L. ; Spärck Jones, K. ; Woodland, Philip C.
Author_Institution :
Dept. of Eng., Cambridge Univ., UK
Abstract :
This paper describes the spoken document retrieval system that we have been developing and assesses its performance using automatic transcriptions of about 50 hours of broadcast news data. The recognition engine is based on the HTK broadcast news transcription system, and the retrieval engine is based on the techniques developed at City University. The retrieval performance over a wide range of speech transcription error rates is presented, and a number of recognition error metrics that more accurately reflect the impact of transcription errors on retrieval accuracy are defined and computed. The results demonstrate the importance of high-accuracy automatic transcription. The final system is currently being evaluated on the 1998 TREC-7 spoken document retrieval task.
Keywords :
broadcasting; information retrieval; natural language interfaces; search engines; speech recognition; Cambridge University; City University; HTK broadcast news transcription system; TREC-7 spoken document retrieval task; broadcast news data; high accuracy automatic transcription; recognition engine; recognition error metrics; retrieval accuracy; retrieval engine; retrieval performance; speech transcription error rates; spoken document retrieval system; Automatic speech recognition; Error analysis; Information retrieval; Internet; Laboratories; Radio broadcasting; Search engines; Speech recognition; System testing; TV broadcasting
Conference_Title :
Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
Conference_Location :
Phoenix, AZ
Print_ISBN :
0-7803-5041-3
DOI :
10.1109/ICASSP.1999.758059