Regional Information Center for Science and Technology - Fast multimedia contents retrieval by partially spoken query

DocumentCode :

2646103

Title :

Fast multimedia contents retrieval by partially spoken query

Author :

Jeong, So-Young ; Han, Icksang ; Kwak, Byung-Kwan ; Cho, Jeongmi ; Kim, Jeongsu

fYear :

2011

fDate :

9-12 Jan. 2011

Firstpage :

839

Lastpage :

840

Abstract :

We present novel, fast multi-pass decoding strategies for recognizing large named-entities on a low-resource embedded device, and thus for retrieving MP3 music using a spoken query that contains partial segments of whole music titles and artist names. After acoustic-phonetic decoding in the first stage, we incorporate word boundary information together with a phonetic confusion matrix into the next stage, partial word matching. We then rescore the candidate phone lists using a more complex context-dependent acoustic model, whose outputs are the retrieved songs. We tested our retrieval system on the task of retrieving 1000 songs on a commercial MP3 player and achieved a relative improvement of about 15.5% in response time over a conventional frame-based multi-pass decoding method, without sacrificing recognition rate.
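
The partial word matching stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the confusion costs, the phone symbols, the toy catalog, and the function names (`sub_cost`, `edit_distance`, `best_partial_match`, `retrieve`) are all hypothetical, and a plain confusion-weighted edit distance stands in for whatever scoring the paper actually uses. The sketch assumes the first pass has already produced a phone sequence for the query, and it leaves out the final rescoring with a context-dependent acoustic model.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical substitution costs for confusable phone pairs
# (0.0 = identical, 1.0 = unrelated). Real values would be estimated from data.
CONFUSION_COST: Dict[Tuple[str, str], float] = {
    ("b", "p"): 0.3,
    ("d", "t"): 0.3,
    ("m", "n"): 0.4,
    ("iy", "ih"): 0.2,
}


def sub_cost(a: str, b: str) -> float:
    """Cost of decoding phone `a` where the title has phone `b`."""
    if a == b:
        return 0.0
    return CONFUSION_COST.get((a, b), CONFUSION_COST.get((b, a), 1.0))


def edit_distance(query: Sequence[str], target: Sequence[str],
                  indel: float = 1.0) -> float:
    """Edit distance whose substitution cost comes from the confusion table."""
    prev = [j * indel for j in range(len(target) + 1)]
    for i, q in enumerate(query, start=1):
        curr = [i * indel]
        for j, t in enumerate(target, start=1):
            curr.append(min(prev[j] + indel,            # extra phone in query
                            curr[j - 1] + indel,        # phone missing in query
                            prev[j - 1] + sub_cost(q, t)))
        prev = curr
    return prev[-1]


def best_partial_match(query: Sequence[str],
                       title_words: List[List[str]]) -> float:
    """Lowest distance between the query and any contiguous run of whole words.

    Restricting candidate spans to word boundaries is how this sketch uses
    word boundary information for partial (sub-title) queries.
    """
    best = float("inf")
    for start in range(len(title_words)):
        span: List[str] = []
        for end in range(start, len(title_words)):
            span = span + title_words[end]
            best = min(best, edit_distance(query, span))
    return best


def retrieve(query: Sequence[str], catalog: Dict[str, List[List[str]]],
             top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank titles by their best partial-match distance (lower is better)."""
    scored = [(title, best_partial_match(query, words))
              for title, words in catalog.items()]
    return sorted(scored, key=lambda item: item[1])[:top_n]


# Toy catalog: each title is a list of words, each word a list of phones.
catalog = {
    "let it be": [["l", "eh", "t"], ["ih", "t"], ["b", "iy"]],
    "yesterday": [["y", "eh", "s", "t", "er", "d", "ey"]],
    "hey jude": [["hh", "ey"], ["jh", "uw", "d"]],
}

# Partial query "it be", with "b" misrecognized as "p" by the first pass.
query = ["ih", "t", "p", "iy"]
results = retrieve(query, catalog)
print(results[0][0])
```

In this toy run the top-ranked title is "let it be", because the span "it be" differs from the query only by the low-cost b/p confusion. In a full system the short list returned by `retrieve` would then be passed to the rescoring stage.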

Keywords :

content-based retrieval; music; speech processing; speech recognition; MP3 music player; acoustic-phonetic decoding; context-dependent acoustic model; fast multimedia contents retrieval; low-resource embedded device; multipass decoding strategy; next stage partial word matching; phone lists; phonetic confusion matrix; spoken query; Acoustics; Context modeling; Decoding; Digital audio players; Speech; Speech recognition; Time factors;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Consumer Electronics (ICCE), 2011 IEEE International Conference on

Conference_Location :

Las Vegas, NV

ISSN :

2158-3994

Print_ISBN :

978-1-4244-8711-0

Type :

conf

DOI :

10.1109/ICCE.2011.5722893

Filename :

5722893

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2646103