DocumentCode :
2646103
Title :
Fast multimedia contents retrieval by partially spoken query
Author :
Jeong, So-Young ; Han, Icksang ; Kwak, Byung-Kwan ; Cho, Jeongmi ; Kim, Jeongsu
fYear :
2011
fDate :
9-12 Jan. 2011
Firstpage :
839
Lastpage :
840
Abstract :
We present novel fast multi-pass decoding strategies for recognizing large named-entities on a low-resource embedded device, and thus for retrieving MP3 music from a spoken query that contains partial segments of whole music titles and artist names. After acoustic-phonetic decoding in the first stage, we incorporate word boundary information together with a phonetic confusion matrix into the next stage, partial word matching. We then rescore the candidate phone lists using a more complex context-dependent acoustic model, whose outputs are the retrieved songs. We tested our retrieval system on the task of retrieving 1000 songs on a commercial MP3 player and achieved about a 15.5% relative improvement in response time over a conventional frame-based multi-pass decoding method without sacrificing recognition rate.
Keywords :
content-based retrieval; music; speech processing; speech recognition; MP3 music player; acoustic-phonetic decoding; context-dependent acoustic model; fast multimedia contents retrieval; low-resource embedded device; multipass decoding strategy; next stage partial word matching; phone lists; phonetic confusion matrix; spoken query; Acoustics; Context modeling; Decoding; Digital audio players; Speech; Speech recognition; Time factors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Consumer Electronics (ICCE), 2011 IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
2158-3994
Print_ISBN :
978-1-4244-8711-0
Type :
conf
DOI :
10.1109/ICCE.2011.5722893
Filename :
5722893