DocumentCode :
3529095
Title :
Web-derived pronunciations
Author :
Ghoshal, Arnab ; Jansche, Martin ; Khudanpur, Sanjeev ; Riley, Michael ; Ulinski, Morgan
Author_Institution :
Johns Hopkins Univ., Baltimore, MD
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
4289
Lastpage :
4292
Abstract :
Pronunciation information is available in large quantities on the Web, in the form of IPA and ad-hoc transcriptions. We describe techniques for extracting candidate pronunciations from Web pages and associating them with orthographic words, filtering out poorly extracted pronunciations, normalizing IPA pronunciations to better conform to a common transcription standard, and generating phonemic representations from ad-hoc transcriptions. We show improvements on a letter-to-phoneme task when using Web-derived vs. Pronlex pronunciations.
Keywords :
Internet; speech processing; IPA pronunciations; Pronlex pronunciations; Web-derived pronunciations; ad-hoc transcriptions; candidate pronunciation extraction; letter-to-phoneme task; orthographic words; Automatic speech recognition; Data mining; Decision trees; Hidden Markov models; Information filtering; Information filters; Law; Speech synthesis; Web pages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960577
Filename :
4960577