Regional Information Center for Science and Technology - Multi-scale-audio indexing for translingual spoken document retrieval

DocumentCode :

1749724

Title :

Multi-scale-audio indexing for translingual spoken document retrieval

Author :

Wang, Hsin-Min ; Meng, Helen ; Schone, Patrick ; Chen, Berlin ; Lo, Wai-Kit

Author_Institution :

Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan

Volume :

fYear :

2001

fDate :

2001

Firstpage :

605

Abstract :

MEI (Mandarin-English Information) is an English-Chinese crosslingual spoken document retrieval (CL-SDR) system developed during the Johns Hopkins University Summer Workshop 2000. We integrate speech recognition, machine translation, and information retrieval technologies to perform CL-SDR. MEI advocates a multi-scale paradigm, where both Chinese words and subwords (characters and syllables) are used in retrieval. Subword units complement the word unit in handling Chinese word tokenization ambiguity, Chinese homophone ambiguity, and out-of-vocabulary words in audio indexing. This paper focuses on multi-scale audio indexing in MEI. Experiments are based on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3), where we indexed Voice of America Mandarin news broadcasts by speech recognition on both the word and subword scales. We discuss the development of the MEI syllable recognizer and the representation of spoken documents using overlapping subword n-grams and lattice structures. Results show that augmenting words with subwords is beneficial to CL-SDR performance.
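
The overlapping subword n-grams mentioned in the abstract can be illustrated with a minimal sketch. This is not the MEI implementation: the function names `overlapping_ngrams` and `multi_scale_terms`, the choice of character bigrams, and the sample phrase are all assumptions made for illustration. It shows how a word-segmented transcript might be indexed on both the word scale and the character scale, so that a match does not depend on a single word segmentation.

```python
def overlapping_ngrams(units, n=2):
    """Return every run of n consecutive units, advancing one unit at a time."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def multi_scale_terms(words, n=2):
    """Build index terms on two scales (hypothetical helper, not from the paper).

    words: a transcript already segmented into words.
    Returns the words themselves plus overlapping character n-grams
    computed over the whole character sequence, ignoring word boundaries.
    """
    chars = [ch for word in words for ch in word]
    return {
        "words": list(words),
        "char_ngrams": overlapping_ngrams(chars, n),
    }


# Illustrative input: a two-word segmentation of a four-character phrase.
terms = multi_scale_terms(["北京", "大学"])
print(terms["words"])
print(terms["char_ngrams"])
```

Because the character bigrams are taken across the whole sequence, one of them spans the boundary between the two words, which is the property that lets subword terms match even when the recognizer and the query are segmented differently.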

Keywords :

audio signal processing; database indexing; grammars; information retrieval; language translation; signal representation; speech recognition; Chinese characters; Chinese homophone ambiguity; Chinese subwords; Chinese syllables; Chinese word token ambiguity; Chinese words; English-Chinese crosslingual spoken document retrieval; Johns Hopkins University; Mandarin news broadcasts; Mandarin-English Information system; TDT-2; TDT-3; Topic Detection and Tracking Corpora; Voice of America; lattice structures; machine translation; multi-scale audio indexing; multi-scale paradigm; out-of-vocabulary words; spoken document representation; subword n-grams; syllable recognizer; Error analysis; Gold; Indexing; Information science; Natural languages; Radio broadcasting; Systems engineering and theory; Vocabulary

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on

Conference_Location :

Salt Lake City, UT

ISSN :

1520-6149

Print_ISBN :

0-7803-7041-4

Type :

conf

DOI :

10.1109/ICASSP.2001.940904

Filename :

940904

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1749724