Title :
Statistical semantic interpretation modeling for spoken language understanding with enriched semantic features
Author :
Celikyilmaz, A. ; Hakkani-Tur, Dilek ; Tur, Gokhan
Abstract :
In natural language human-machine statistical dialog systems, semantic interpretation is a key task that is typically performed after semantic parsing and aims to extract canonical meaning representations of semantic components. In the literature, manually built rules are usually used for this task, even for implicitly mentioned non-named semantic components (such as the genre of a movie or the price range of a restaurant). In this study, we present statistical methods for modeling interpretation, which can also benefit from semantic features extracted from large in-domain knowledge sources. We extract features from user utterances using a semantic parser, and additional semantic features from textual sources (online reviews, synopses, etc.) using a novel tree clustering approach, to represent unstructured information that corresponds to implicit semantic components related to targeted slots in the user's utterances. We evaluate our models on a virtual personal assistance system and demonstrate that our interpreter not only improves utterance interpretation in spoken dialog systems (reducing the interpretation error rate by 36% relative to a language model baseline), but also unveils hidden semantic units that are otherwise nearly impossible to extract from the purely manual lexical features typically used in utterance interpretation.
Keywords :
feature extraction; interactive systems; natural language processing; pattern clustering; speech recognition; speech synthesis; statistical analysis; canonical meaning representation extraction; implicit semantic components; lexical features; natural language human-machine statistical dialog systems; nonnamed semantic components; semantic feature extraction; semantic parsing; spoken dialog systems; spoken language understanding; statistical methods; statistical semantic interpretation modeling; textual sources; tree clustering approach; unstructured information representation; user utterances; utterance interpretation improvement; virtual personal assistance system; Data mining; Databases; Dictionaries; Engines; Motion pictures; Semantics; graphical models; semantic interpretation; semi-supervised clustering
Conference_Title :
Spoken Language Technology Workshop (SLT), 2012 IEEE
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4673-5125-6
Electronic_ISBN :
978-1-4673-5124-9
DOI :
10.1109/SLT.2012.6424225