DocumentCode :
727795
Title :
Towards a hybrid NLG system for Data2Text in Portuguese
Author :
Pereira, Jose Casimiro ; Teixeira, Antonio ; Sousa Pinto, Joaquim
Author_Institution :
Inst. Politec. Tomar, Tomar, Portugal
fYear :
2015
fDate :
17-20 June 2015
Firstpage :
1
Lastpage :
6
Abstract :
In many new forms of interaction with machines, such as dialogue or voice output, information internal to a system must be converted into sentences using Data2Text systems. To avoid the limitations of template-based and classical NLG methods, systems based on automatic translation have been proposed in recent years. Although they provide the sentence variability needed for better interaction, this comes at a cost: unlike template-based systems, they produce sentences of heterogeneous quality. In this paper we propose combining a translation-based NLG system with a classifier module capable of providing information on the intelligibility or quality of the sentences. Sentences marked as unacceptable are replaced by template-generated ones. This classifier module is the main focus of the paper and combines extraction of linguistic features with a classifier trained on a manually annotated corpus. Results suggest that the approach is valid: the best results have false positives below 8%, and this metric can be even lower in practical applications, decreasing to around 3%, as the generation module produces low-quality sentences at a rate lower than 30%.
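The fallback scheme described in the abstract (generate with the translation-based module, check the sentence with the classifier, substitute a template sentence when it is rejected) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `hybrid_generate`, the three `toy_*` stand-ins, the record fields, and the example sentences are all hypothetical and chosen only to make the control flow concrete.

```python
from typing import Callable, Dict

Record = Dict[str, str]


def hybrid_generate(
    record: Record,
    translate: Callable[[Record], str],
    is_acceptable: Callable[[str], bool],
    render_template: Callable[[Record], str],
) -> str:
    """Return the translation-based sentence if the classifier accepts it,
    otherwise fall back to the template-based sentence."""
    candidate = translate(record)
    if is_acceptable(candidate):
        return candidate
    return render_template(record)


# --- Hypothetical stand-ins, for illustration only ---

def toy_translate(record: Record) -> str:
    # Pretend the translation-based generator produces a garbled
    # sentence whenever the "team" field is missing.
    if "team" not in record:
        return "marcou golo o"
    return f"{record['player']} marcou pelo {record['team']}."


def toy_is_acceptable(sentence: str) -> bool:
    # Placeholder for the quality classifier; the real module would
    # use linguistic features and a model trained on annotated data.
    return sentence.endswith(".")


def toy_template(record: Record) -> str:
    # Fixed template: lower variability, predictable quality.
    return f"{record['player']} marcou um golo."


if __name__ == "__main__":
    full = {"player": "Ana", "team": "Tomar"}
    partial = {"player": "Ana"}
    print(hybrid_generate(full, toy_translate, toy_is_acceptable, toy_template))
    print(hybrid_generate(partial, toy_translate, toy_is_acceptable, toy_template))
```

Passing the three components as callables keeps the fallback logic independent of how the generator and the classifier are actually built, so either could be swapped without touching the decision step.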
Keywords :
natural language processing; text analysis; Data2Text systems; Portuguese; automatic translation; classical NLG methods; classifier module; hybrid NLG system; information internal; linguistic features; quality of the sentences; Feature extraction; Measurement; Natural languages; Pragmatics; Radio frequency; Support vector machines; Vegetation; Data2Text; Natural Language Generation (NLG); Portuguese; sentences quality evaluation; translation based NLG;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Systems and Technologies (CISTI), 2015 10th Iberian Conference on
Conference_Location :
Aveiro
Type :
conf
DOI :
10.1109/CISTI.2015.7170419
Filename :
7170419