DocumentCode :
2822122
Title :
Automatic Extraction of Spoken Word in Broadcast Media Language
Author :
Zhang, Yuqiang ; Zou, Yu ; He, Wei ; Hou, Min ; Teng, Yonglin
Author_Institution :
Broadcast Media Language Res. Center, Commun. Univ. of China, Beijing, China
Volume :
2
fYear :
2009
fDate :
24-26 April 2009
Firstpage :
403
Lastpage :
405
Abstract :
Compared with the written word, the spoken word has received little attention from researchers because spoken corpora are difficult to obtain. To advance research on spoken words, this paper proposes a novel method for automatically extracting spoken words from broadcast language. The results are promising: using a model based on the spatial distribution of word usage frequency, we extracted 3009 spoken words, with a correct extraction rate of over 85% on the Part I data and 76.5% on the Part II data. The model can effectively extract spoken words from broadcast media language and distinguish them from other words.
Keywords :
information retrieval; speech processing; word processing; automatic spoken word extraction; broadcast media language; spatial distribution model; spoken corpora; word usage frequency; Data mining; Frequency; Helium; Large-scale systems; Logistics; Natural languages; Radio broadcasting; Speech; TV broadcasting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Sciences and Optimization, 2009. CSO 2009. International Joint Conference on
Conference_Location :
Sanya, Hainan
Print_ISBN :
978-0-7695-3605-7
Type :
conf
DOI :
10.1109/CSO.2009.82
Filename :
5193982