Regional Information Center for Science and Technology - A study on sports video classification based on audio analysis and speech recognition

DocumentCode :

2022687

Title :

A study on sports video classification based on audio analysis and speech recognition

Author :

Lu, Li ; Zhao, Qingwei ; Yan, Yonghong ; Liu, Kun

Author_Institution :

Think IT Speech Lab., Chinese Acad. of Sci., Beijing, China

fYear :

2010

fDate :

23-25 Nov. 2010

Firstpage :

737

Lastpage :

742

Abstract :

This paper proposes a method to deal with the problem of sports classification through audio analysis. First, a two-pass audio segmentation module is developed as the front-end to extract the announcer's speech from the audio streams. Then speech recognition technology is employed on the speech segments to extract keywords, which are used as features to distinguish different sports. Finally, based on the keyword spotting (KWS) results and specific keywords selected for each kind of sport, a score ranking strategy is designed for conducting classification automatically. For robust KWS in our system, adaptation techniques for the acoustic model and the language model are employed, and both of them show significant improvements in KWS performance. Fifteen games from seven kinds of sports are used to evaluate the system performance. By integrating all the techniques, an average figure of metric (FOM) of 70.74 is achieved on the KWS task, and a 100% accuracy rate is achieved on the sports classification task using all detected keywords of each game.
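
The score ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula or the keyword lists, so everything here is an assumption made for illustration: the `SPORT_KEYWORDS` table, the function names `rank_sports` and `classify`, and the choice to sum keyword-spotting confidences per sport. It is not the authors' implementation.

```python
# Hypothetical keyword lists; the paper's actual per-sport keywords
# are not given in the abstract.
SPORT_KEYWORDS = {
    "basketball": {"rebound", "dunk", "three-pointer"},
    "football": {"goal", "offside", "corner"},
    "tennis": {"ace", "deuce", "volley"},
}


def rank_sports(detections, sport_keywords=SPORT_KEYWORDS):
    """Rank sports by the summed confidence of their detected keywords.

    detections: iterable of (keyword, confidence) pairs, as a keyword
    spotter might emit them (assumed format).
    Returns a list of (sport, score) pairs, highest score first.
    """
    scores = {sport: 0.0 for sport in sport_keywords}
    for keyword, confidence in detections:
        for sport, keywords in sport_keywords.items():
            if keyword in keywords:
                scores[sport] += confidence
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def classify(detections, sport_keywords=SPORT_KEYWORDS):
    """Return the top-ranked sport, or None if no listed keyword was detected."""
    best_sport, best_score = rank_sports(detections, sport_keywords)[0]
    return best_sport if best_score > 0 else None


if __name__ == "__main__":
    # Made-up detections for one game; keywords outside every list are ignored.
    game = [("goal", 0.9), ("corner", 0.6), ("ace", 0.7), ("unknown", 0.99)]
    print(rank_sports(game))
    print(classify(game))
```

Summing confidences is only one plausible scoring rule; counting hits or weighting keywords by how specific they are to a sport would fit the same ranking structure.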

Keywords :

acoustic signal processing; audio streaming; feature extraction; image classification; speech recognition; sport; video signal processing; KWS performance; acoustic model; audio analysis; audio stream; figure-of-metric; keyword spotting; language model; score ranking; speech extraction; speech recognition; speech segment; sports video classification; two-pass audio segmentation; Acoustics; Adaptation model; Feature extraction; Games; Hidden Markov models; Speech; Speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Audio Language and Image Processing (ICALIP), 2010 International Conference on

Conference_Location :

Shanghai

Print_ISBN :

978-1-4244-5856-1

Type :

conf

DOI :

10.1109/ICALIP.2010.5685074

Filename :

5685074

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2022687