Parallel model combination and word recognition in soccer audio

Author

Longton, Jack H.; Jackson, Philip J. B.

Author_Institution

Centre for Vision, Speech & Signal Process. (CVSSP), Surrey Univ., Guildford

fYear

2008

fDate

June 23 2008-June 26 2008

Firstpage

1465

Lastpage

1468

Abstract

Audio from broadcast soccer can be used for identifying highlights from the game. Audio cues derived from these sources provide valuable information about game events, as can the detection of key words used by the commentators. In this paper we investigate the feasibility of incorporating both commentator word recognition and information about the additive background noise in an HMM structure. A limited set of audio cues, which have been extracted from data collected from the 2006 FIFA World Cup, are used to create an extension to the Aurora-2 database. The new database is then tested with various PMC models and compared to the standard baseline, clean and multi-condition training methods. It is found that incorporating SNR and noise type information into the PMC process is beneficial to recognition performance.

Keywords

audio databases; database indexing; hidden Markov models; speech recognition; 2006 FIFA World Cup; Aurora-2 database; HMM structure; PMC models; additive background noise; audio cues; audio indexing; broadcast soccer; commentator word recognition; game events; key word detection; parallel model combination; soccer audio; Background noise; Bridges; Databases; Hidden Markov models; Layout; Microphones; Noise level; Speech enhancement; Speech recognition; Testing; Audio indexing; HMM; soccer

fLanguage

English

Publisher

IEEE

Conference_Titel

Multimedia and Expo, 2008 IEEE International Conference on

Conference_Location

Hannover

Print_ISBN

978-1-4244-2570-9

Electronic_ISBN

978-1-4244-2571-6

Type

conf

DOI

10.1109/ICME.2008.4607722

Filename

4607722