DocumentCode
2418662
Title
A speech-video synchrony quality metric using CoIA
Author
Wei Yaodu ; Xie Xiang ; Kuang Jingming ; Han Xinlu
Author_Institution
Dept. of Electron. Eng., Beijing Inst. of Technol., Beijing, China
fYear
2010
fDate
13-14 Dec. 2010
Firstpage
173
Lastpage
177
Abstract
A quality model was built to assess the influence of speech-video asynchrony on perceived audio-visual quality. The audio-visual content was separated into two categories, “speaker inside” and “speaker outside”, depending on whether the speaker is visible in the video. For the first category, the speech was shifted by small offsets. DCT coefficients were calculated from the video and MFCC coefficients from the speech. Co-inertia Analysis (CoIA) was used to measure the speech-video correlation, and as the speech was progressively shifted, a correlation curve emerged. The curve was modeled by a Gaussian function, which was then used to predict the perceptual quality. In contrast, for the “speaker outside” category, a Gaussian curve was used to predict the perceptual quality. A subjective test confirmed the effectiveness of the proposed method.
Keywords
Gaussian processes; audio-visual systems; correlation methods; discrete cosine transforms; speech processing; video signal processing; COIA; DCT coefficient; Gaussian function; MFCC coefficient; audio visual content; audio visual quality perception; coinertia analysis; correlation curve; speech video correlation; speech video synchrony quality; Correlation; Hidden Markov models; Mouth; Speech; Streaming media; Synchronization; Audio-visual quality; QVGA; asynchrony; co-inertia analysis; speech
fLanguage
English
Publisher
IEEE
Conference_Titel
Packet Video Workshop (PV), 2010 18th International
Conference_Location
Hong Kong
Print_ISBN
978-1-4244-9522-1
Electronic_ISBN
978-1-4244-9520-7
Type
conf
DOI
10.1109/PV.2010.5706835
Filename
5706835