DocumentCode :
1518709
Title :
Visual Sentences for Pose Retrieval Over Low-Resolution Cross-Media Dance Collections
Author :
Ren, Reede ; Collomosse, John
Author_Institution :
Dept. of Comput. Sci., Univ. of Glasgow, Glasgow, UK
Volume :
14
Issue :
6
fYear :
2012
Firstpage :
1652
Lastpage :
1661
Abstract :
We describe a system for matching human posture (pose) across a large cross-media archive of dance footage spanning nearly 100 years, comprising digitized photographs and videos of rehearsals and performances. This footage presents unique challenges due to its age, quality and diversity. We propose a forest-like pose representation combining visual structure (self-similarity) descriptors over multiple scales, without explicitly detecting limb positions, which would be infeasible for our data. We explore two complementary multi-scale representations, applying passage retrieval and latent Dirichlet allocation (LDA) techniques inspired by the text retrieval domain to the problem of pose matching. The result is a robust system capable of quickly searching large cross-media collections for similarity to a visually specified query pose. We evaluate over a cross-section of the UK National Research Centre for Dance's (UK-NRCD) and the Siobhan Davies Replay's (SDR) digital dance archives, using visual queries supplied by dance professionals. We demonstrate significant performance improvements over two baselines: classical single and multi-scale bag of visual words (BoVW) and spatial pyramid kernel (SPK) matching.
Keywords :
humanities; image matching; image representation; image resolution; image retrieval; pose estimation; statistics; text analysis; video signal processing; BoVW; LDA technique; SDR digital dance archives; SPK matching; Siobhan Davies Replay digital dance archives; UK National Research Centre for Dance; UK-NRCD; classical single bag of visual words; complementary multiscale representations; cross-media archive; dance footage; digitized photographs; forest-like pose representation; human posture matching; latent Dirichlet allocation technique; low-resolution cross-media dance collection; multiscale bag of visual words; passage retrieval; performance video; pose matching; pose retrieval; query pose; rehearsal video; spatial pyramid kernel matching; text retrieval domain; visual query; visual sentences; visual structure descriptor; Estimation; Image segmentation; Joints; Materials; Shape; Videos; Visualization; Content based image retrieval; dance archives; low-resolution pose similarity;
fLanguage :
English
Journal_Title :
Multimedia, IEEE Transactions on
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2012.2199971
Filename :
6202345