Regional Information Center for Science and Technology (RICeST) - A mouth full of words: Visually consistent acoustic redubbing

DocumentCode :

3431284

Title :

A mouth full of words: Visually consistent acoustic redubbing

Author :

Taylor, Sarah ; Theobald, Barry-John ; Matthews, Iain

Author_Institution :

Disney Research, Pittsburgh, PA, USA

fYear :

2015

fDate :

19-24 April 2015

Firstpage :

4904

Lastpage :

4908

Abstract :

This paper introduces a method for automatic redubbing of video that exploits the many-to-many mapping of phoneme sequences to lip movements modelled as dynamic visemes (Taylor et al., 2012). For a given utterance, the corresponding dynamic viseme sequence is sampled to construct a graph of possible phoneme sequences that synchronize with the video. When composed with a pronunciation dictionary and language model, this produces a vast number of word sequences that are in sync with the original video, literally putting plausible words into the mouth of the speaker. We demonstrate that traditional, many-to-one, static visemes lack flexibility for this application as they produce significantly fewer word sequences. This work explores the natural ambiguity in visual speech and offers insight for automatic speech recognition and the importance of language modeling.
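As a rough illustration of the idea in the abstract, the sketch below expands a sequence of visual units into every phoneme sequence it is compatible with, then keeps those that a pronunciation dictionary recognizes as words. It is a toy under stated assumptions, not the authors' implementation: the viseme labels (`V1`, `V2`), the phoneme sets, the dictionary entries, and the function names are all hypothetical, and it handles a single word with no language model and no graph composition.

```python
from itertools import product

# Hypothetical toy mapping: each visual unit is compatible with several
# phoneme strings. Labels and phoneme sets are invented for illustration.
VISEME_TO_PHONES = {
    "V1": [("p",), ("b",), ("m",)],
    "V2": [("ae", "t"), ("ae", "d"), ("ae", "n")],
}

# Hypothetical pronunciation dictionary: phoneme tuple -> word.
PRONUNCIATIONS = {
    ("p", "ae", "t"): "pat",
    ("b", "ae", "t"): "bat",
    ("m", "ae", "t"): "mat",
    ("b", "ae", "d"): "bad",
    ("m", "ae", "d"): "mad",
    ("p", "ae", "n"): "pan",
    ("b", "ae", "n"): "ban",
    ("m", "ae", "n"): "man",
}


def phoneme_sequences(viseme_seq):
    """Yield every phoneme sequence compatible with the viseme sequence."""
    options = [VISEME_TO_PHONES[v] for v in viseme_seq]
    for choice in product(*options):
        # Flatten the per-viseme phoneme chunks into one sequence.
        yield tuple(p for chunk in choice for p in chunk)


def candidate_words(viseme_seq):
    """Return the dictionary words whose pronunciation fits the visemes."""
    return sorted(
        {PRONUNCIATIONS[s] for s in phoneme_sequences(viseme_seq)
         if s in PRONUNCIATIONS}
    )


if __name__ == "__main__":
    print(candidate_words(["V1", "V2"]))
```

Even in this tiny example, one visual sequence admits several distinct words, which is the ambiguity the abstract refers to. A language model would then be needed to rank the alternatives once the candidates are extended to full word sequences.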

Keywords :

audio-visual systems; speech processing; automatic speech recognition; automatic video redubbing; dynamic viseme sequence; language modeling; lip movements; many-to-one static visemes; natural ambiguity; phoneme sequences; pronunciation dictionary; visual speech; visually consistent acoustic redubbing; word sequences; Acoustics; Active appearance model; Dynamics; Speech; Speech recognition; Synchronization; Visualization; Audio-visual speech; acoustic redubbing; dynamic visemes

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location :

South Brisbane, QLD, Australia

Type :

conf

DOI :

10.1109/ICASSP.2015.7178903

Filename :

7178903

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3431284