Regional Information Center for Science and Technology - Word clustering with parallel spoken language corpora

DocumentCode :

2267249

Title :

Word clustering with parallel spoken language corpora

Author :

Wang, Ye-Yi ; Lafferty, John ; Waibel, Alex

Author_Institution :

Carnegie Mellon Univ., Pittsburgh, PA, USA

Volume :

fYear :

1996

fDate :

3-6 Oct 1996

Firstpage :

2364

Abstract :

We introduce a word clustering algorithm which uses a bilingual, parallel corpus to group together words in the source and target language. Our method generalizes previous mutual information clustering algorithms for monolingual data by incorporating a statistical translation model. Preliminary experiments have shown that the algorithm can effectively employ the constraints implicit in bilingual data to extract classes which are well suited to machine translation tasks.
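
For orientation, the monolingual mutual information clustering that the abstract says the method generalizes can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' algorithm: it covers only the monolingual greedy-merge baseline, it leaves out the statistical translation model and bilingual corpus entirely, and the function names `average_mutual_information` and `greedy_cluster` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def average_mutual_information(bigrams, cls):
    """Mutual information (in nats) between the classes of adjacent words."""
    joint = Counter()
    for (w1, w2), n in bigrams.items():
        joint[(cls[w1], cls[w2])] += n
    total = sum(joint.values())
    left, right = Counter(), Counter()
    for (c1, c2), n in joint.items():
        left[c1] += n
        right[c2] += n
    # p(c1,c2) * log(p(c1,c2) / (p(c1) * p(c2))), written with raw counts
    return sum(
        (n / total) * math.log(n * total / (left[c1] * right[c2]))
        for (c1, c2), n in joint.items()
    )


def greedy_cluster(tokens, num_classes):
    """Greedily merge word classes, keeping mutual information as high as possible."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Assumption: every word starts in its own class, labeled by the word itself.
    cls = {w: w for w in tokens}
    while len(set(cls.values())) > num_classes:
        best = None
        # Brute-force search over all class pairs; fine only for toy vocabularies.
        for a, b in combinations(sorted(set(cls.values())), 2):
            trial = {w: (a if c == b else c) for w, c in cls.items()}
            score = average_mutual_information(bigrams, trial)
            if best is None or score > best[0]:
                best = (score, trial)
        cls = best[1]
    return cls


if __name__ == "__main__":
    words = "the cat sat the dog sat the cat ran the dog ran".split()
    print(greedy_cluster(words, 3))
```

The brute-force pair search re-scores the whole corpus for every candidate merge, which keeps the sketch short but would not scale to a realistic vocabulary.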

Keywords :

language translation; natural languages; speech processing; statistical analysis; word processing; bilingual data; bilingual parallel corpus; machine translation tasks; monolingual data; mutual information clustering algorithms; parallel spoken language corpora; statistical translation model; word clustering algorithm; Books; Bridges; Clustering algorithms; Data mining; Entropy; Greedy algorithms; Merging; Mutual information; Natural languages; Scheduling

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on

Conference_Location :

Philadelphia, PA

Print_ISBN :

0-7803-3555-4

Type :

conf

DOI :

10.1109/ICSLP.1996.607283

Filename :

607283

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2267249