Title :
Improved named entity extraction from conversational speech with language model adaptation
Author :
Siu, Man-Hung ; Vessenes, Ted ; Bulyko, Ivan ; Kimball, Owen
Author_Institution :
Raytheon BBN Technol., Cambridge, MA, USA
Abstract :
Named entity (NE) extraction is traditionally applied to written text. Some recent work extends extraction to broadcast speech, but most of these approaches simply cascade a speech-to-text (STT) engine and a named entity tagger without any information sharing between the two. In this paper, we extract named entities from conversational speech and explore approaches that couple STT and NE extraction beyond a simple cascade. We propose a new approach that adapts the STT language model (LM) based on the extracted named entities. This steers the STT engine to focus on the subset of the vocabulary that is most important to NE extraction. We performed a number of experiments on English conversational speech from the Fisher Corpus at different STT recognition speeds. We show that the LM adaptation approach increases NE extraction recall and improves NE performance by as much as 15%, as measured by F-score. It does so by significantly reducing the word error rate (WER) of NE words, with minimal impact on the overall WER.
Keywords :
speech synthesis; English conversational speech; broadcast speech extraction; language model adaptation; language model adaptation approach; named entity extraction; speech-to-text engine; adaptation; language model
Conference_Title :
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location :
Berkeley, CA
Print_ISBN :
978-1-4244-7904-7
Electronic_ISBN :
978-1-4244-7902-3
DOI :
10.1109/SLT.2010.5700889