DocumentCode :
2938964
Title :
Gaze-Driven video streaming with saliency-based dual-stream switching
Author :
Yunlong Feng ; Gene Cheung ; Wai-tian Tan ; Yusheng Ji
fYear :
2012
fDate :
27-30 Nov. 2012
Firstpage :
1
Lastpage :
6
Abstract :
A person's ability to perceive image detail falls off sharply as the angle from the visual focus increases. At any given bitrate, perceived visual quality can be improved by employing region-of-interest (ROI) coding, where higher encoding quality is judiciously applied only to regions close to a viewer's focal point. Straightforward matching of a viewer's focal point with ROI coding using a live encoder, however, is computation-intensive. In this paper, we propose a system that supports ROI coding without the need for a live encoder. The system is based on dynamic switching between two pre-encoded streams of the same content: one at high quality (HQ), and the other at mixed quality (MQ), where the quality of a spatial region depends on its pre-computed visual saliency values. Distributed source coding (DSC) frames are periodically inserted to facilitate switching. Using a Hidden Markov Model (HMM) to model a viewer's temporal gaze movement, the MQ stream is pre-encoded based on ROI coding to minimize the expected streaming rate, while keeping the probability of a viewer observing low quality (LQ) spatial regions below an application-specific ϵ. At stream time, the viewer's gaze locations are collected and transmitted to the server for intelligent stream switching. In particular, the server employs the MQ stream only if: i) the viewer's tracked gaze location falls inside the high-saliency regions, and ii) the probability that the viewer's gaze point will soon move outside high-saliency regions, computed using tracked gaze data and updated saliency values, is below ϵ. Experiments showed that the video streaming rate can be reduced by up to 44%, and that subjective quality is noticeably better than that of a competing scheme at the same rate in which the entire video is encoded using equal quantization.
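The two-condition switching rule in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses an HMM with saliency updates, whereas this sketch assumes a plain first-order Markov chain with a fixed probability `p_stay` of the gaze remaining in a high-saliency region per step. The names `prob_leave_within`, `choose_stream`, `p_stay`, and `horizon` are hypothetical.

```python
def prob_leave_within(p_stay: float, horizon: int) -> float:
    """Probability that the gaze leaves the high-saliency region at least
    once within `horizon` steps, assuming (hypothetically) a first-order
    Markov chain with constant self-transition probability `p_stay`."""
    if not 0.0 <= p_stay <= 1.0:
        raise ValueError("p_stay must be a probability in [0, 1]")
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    # Staying for all `horizon` steps has probability p_stay ** horizon.
    return 1.0 - p_stay ** horizon


def choose_stream(gaze_in_high_saliency: bool, p_stay: float,
                  horizon: int, epsilon: float) -> str:
    """Return "MQ" only if both conditions from the abstract hold:
    (i) the tracked gaze is inside a high-saliency region, and
    (ii) the probability of soon leaving it is below epsilon.
    Otherwise fall back to the HQ stream."""
    if gaze_in_high_saliency and prob_leave_within(p_stay, horizon) < epsilon:
        return "MQ"
    return "HQ"


if __name__ == "__main__":
    # Illustrative values only; none of these numbers come from the paper.
    print(choose_stream(True, 0.99, 3, 0.05))
    print(choose_stream(True, 0.90, 3, 0.05))
    print(choose_stream(False, 0.99, 3, 0.05))
```

In this simplified form, a smaller ϵ makes the server fall back to the HQ stream more often, trading streaming rate for a lower chance that the viewer sees a low-quality region.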
Keywords :
hidden Markov models; precoding; quantisation (signal); source coding; video coding; video streaming; DSC frames; HMM; HQ stream; MQ stream; ROI coding; distributed source coding; dynamic switching; encoding quality; equal quantization; gaze-driven video streaming; hidden Markov model; high-quality stream; high-saliency regions; intelligent stream switching; live encoder; mixed quality stream; perceived visual quality; pre-computed visual saliency values; pre-encoded streams; region-of-interest coding; saliency-based dual-stream switching; straight-forward matching; streaming rate minimization; temporal gaze movement; tracked gaze data; updated saliency values; viewer focal point; viewer gaze location; viewer observing low-quality spatial region probability; visual focus; Encoding; Equations; Hidden Markov models; Servers; Streaming media; Switches; Visualization; Region-of-Interest encoding; video streaming; visual saliency;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Visual Communications and Image Processing (VCIP), 2012 IEEE
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4673-4405-0
Electronic_ISBN :
978-1-4673-4406-7
Type :
conf
DOI :
10.1109/VCIP.2012.6410732
Filename :
6410732