Title :
Training a statistical surface realiser from automatic slot labelling
Author :
Cuayahuitl, Heriberto ; Dethlefs, Nina ; Hastie, Helen ; Liu, Xingkun
Author_Institution :
Sch. of Math. & Comput. Sci., Heriot-Watt Univ., Edinburgh, UK
Abstract :
Training a statistical surface realiser typically relies on labelled training data or parallel data sets, such as corpora of paraphrases. The procedure for obtaining such data for new domains is not only time-consuming, but it also restricts the incorporation of new semantic slots during an interaction, i.e. in an online learning scenario for automatically extended domains. Here, we present an alternative approach to statistical surface realisation from unlabelled data through automatic semantic slot labelling. The essence of our algorithm is to cluster clauses based on a similarity function that combines lexical and semantic information. Annotations need to be reliable enough to be utilised within a spoken dialogue system. We compare different similarity functions and evaluate our surface realiser, trained from unlabelled data, in a human rating study. Results confirm that a surface realiser trained from automatic slot labels can produce outputs of comparable quality to those of a realiser trained from human-labelled inputs.
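The clause-clustering idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity functions, so everything below is an assumption for illustration only: Jaccard word overlap stands in for the lexical component, a hypothetical toy lexicon (SEMANTIC_CLASSES) stands in for the semantic component, the two are mixed with an assumed weight, and a simple greedy threshold clustering replaces whatever clustering procedure the paper actually uses.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy lexicon mapping words to coarse semantic classes
# (an assumption for illustration, not taken from the paper).
SEMANTIC_CLASSES = {
    "cheap": "price", "expensive": "price",
    "italian": "food", "chinese": "food",
    "centre": "area", "north": "area",
}


def similarity(clause_a, clause_b, weight=0.5):
    """Weighted mix of lexical and semantic overlap (assumed form)."""
    words_a = clause_a.lower().split()
    words_b = clause_b.lower().split()
    lexical = jaccard(words_a, words_b)
    sem_a = {SEMANTIC_CLASSES[w] for w in words_a if w in SEMANTIC_CLASSES}
    sem_b = {SEMANTIC_CLASSES[w] for w in words_b if w in SEMANTIC_CLASSES}
    semantic = jaccard(sem_a, sem_b)
    return weight * lexical + (1 - weight) * semantic


def cluster_clauses(clauses, threshold=0.4):
    """Greedy clustering: join the most similar cluster, else start a new one."""
    clusters = []
    for clause in clauses:
        best, best_score = None, threshold
        for cluster in clusters:
            # Compare against the first clause of the cluster as its representative.
            score = similarity(clause, cluster[0])
            if score >= best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append([clause])
        else:
            best.append(clause)
    return clusters


clauses = [
    "serves cheap italian food",
    "serves expensive chinese food",
    "in the north area",
    "in the centre area",
]
clusters = cluster_clauses(clauses)
```

In this sketch each resulting cluster would act as a candidate slot label for the clauses it contains; the weight and threshold values are arbitrary and would need tuning on real data.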
Keywords :
data handling; interactive systems; learning (artificial intelligence); pattern clustering; statistical analysis; automatic semantic slot labelling; clause clustering; dialogue system; lexical information; online learning scenario; semantic information; similarity function; statistical surface realiser; unlabelled data; Accuracy; Clustering algorithms; Labeling; Measurement; Semantics; Supervised learning; Training; dialogue systems; semantic slot labelling; surface realisation; unsupervised and supervised learning;
Conference_Title :
Spoken Language Technology Workshop (SLT), 2014 IEEE
DOI :
10.1109/SLT.2014.7078559