Regional Information Center for Science and Technology (RICEST) - Speech technology for multimedia content management

DocumentCode :

2820470

Title :

Speech technology for multimedia content management

Author :

Nguyen, P. ; Kryze, D. ; Kuhn, R. ; Kobayashi, M. ; Yasukata, M.

Author_Institution :

Panasonic Speech Technol. Lab., Panasonic Technol. Co., Santa Barbara, CA, USA

fYear :

2004

fDate :

5-8 Jan. 2004

Firstpage :

376

Lastpage :

381

Abstract :

Multimedia content cannot be retrieved effectively unless metadata describing it is generated. However, metadata generation tends to be time-consuming and expensive, since it typically involves human beings going through the content and manually tagging it. The paper shows how automatic speech recognition (ASR) technology can be used to carry out metadata generation with significantly less expenditure of human effort. The paper describes two different approaches: voice tagging, whereby human beings tag the data but this process is speeded up by applying ASR to the tagging process; audio indexing, whereby much of the tagging process is automated by applying ASR to the content itself.

Keywords :

content-based retrieval; meta data; multimedia computing; speech recognition; ASR; audio indexing; automatic speech recognition; metadata generation; multimedia content management; speech technology; voice tagging; Automatic speech recognition; Content based retrieval; Content management; Educational institutions; Games; Humans; Indexing; Laboratories; Tagging; Watches;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Consumer Communications and Networking Conference, 2004. CCNC 2004. First IEEE

Conference_Location :

Las Vegas, NV, USA

Print_ISBN :

0-7803-8145-9

Type :

conf

DOI :

10.1109/CCNC.2004.1286891

Filename :

1286891

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2820470