DocumentCode :
2782750
Title :
A visual voice activity detection method with adaboosting
Author :
Qingju Liu ; Wenwu Wang ; Jackson, P.
Author_Institution :
Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK
fYear :
2011
fDate :
27-29 Sept. 2011
Firstpage :
1
Lastpage :
5
Abstract :
Spontaneous speech in videos capturing the speaker's mouth provides bimodal information. Exploiting the relationship between the audio and visual streams, we propose a new visual voice activity detection (VAD) algorithm to overcome the vulnerability of conventional audio VAD techniques in the presence of background interference. First, a novel lip extraction algorithm combining rotational templates and prior shape constraints with active contours is introduced. The visual features are then obtained from the extracted lip region. Second, with the audio voice activity vector used in training, adaboosting is applied to the visual features to generate a strong final voice activity classifier by boosting a set of weak classifiers. We have tested our lip extraction algorithm on the XM2VTS database (with higher resolution) and some video clips from YouTube (with lower resolution). The visual VAD was shown to offer low error rates.
Keywords :
audio streaming; feature extraction; speech processing; video streaming; XM2VTS database; YouTube; active contours; adaboosting; audio VAD techniques; audio streams; audio voice activity vector; background interference; bimodal information; lip extraction algorithm; prior shape constraints; rotational templates; speaker mouth; video clips; visual VAD; visual streams; visual voice activity detection method; voice activity classifier;
fLanguage :
English
Publisher :
IET
Conference_Titel :
Sensor Signal Processing for Defence (SSPD 2011)
Conference_Location :
London
Electronic_ISBN :
978-1-84919-661-1
Type :
conf
DOI :
10.1049/ic.2011.0145
Filename :
6253401