Title :
Sentence generation from a bag of words using N-gram model
Author :
Yadav, Arun Kumar ; Borgohain, Samir Kumar
Author_Institution :
Dept. of Comput. Sci. & Eng., Nat. Inst. of Technol. Silchar, Silchar, India
Abstract :
This paper presents a method for generating sentences from a given bag of words. Sentence generation is useful in applications such as text summarization and question answering. The goal is to generate all possible correct sentences from a given bag of words. The technique applied is the N-gram language model, which is trained on a text corpus so that only candidate sequences are generated from the bag of words. For N input words, instead of considering all N! permutations as candidate sequences, we generate fewer than N! candidates by applying a depth-first search (DFS) filtering technique at run time. We use two corpora: a text corpus and a corpus annotated with POS tags. All valid POS tag trigrams are extracted from the annotated corpus. Each generated candidate sequence has a probability score. The candidate sequences are ranked by matching them against the valid trigram POS tag signatures and by their probability scores. Preliminary experiments with this model show promising results.
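The DFS filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a bigram model (the abstract says only "N-gram"), a toy three-sentence corpus, and the hypothetical function names `train_bigrams` and `generate_candidates`. The POS trigram ranking step is omitted. A partial ordering is abandoned as soon as it would require a word pair never seen in the corpus, so far fewer than N! orderings are expanded.

```python
from collections import Counter

def train_bigrams(sentences):
    """Count unigrams and bigrams, with <s> marking the sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def generate_candidates(bag, unigrams, bigrams):
    """Depth-first search over orderings of `bag`, pruning unseen bigrams."""
    results = []

    def dfs(prefix, remaining, score):
        if not remaining:
            results.append((score, " ".join(prefix[1:])))
            return
        tried = set()  # avoid expanding duplicate words twice at one depth
        for i, word in enumerate(remaining):
            if word in tried:
                continue
            tried.add(word)
            count = bigrams[(prefix[-1], word)]
            if count == 0:
                continue  # prune: this word pair never occurs in the corpus
            prob = count / unigrams[prefix[-1]]  # relative-frequency estimate
            dfs(prefix + [word], remaining[:i] + remaining[i + 1:], score * prob)

    dfs(["<s>"], [w.lower() for w in bag], 1.0)
    return sorted(results, reverse=True)  # highest probability score first

# Toy corpus, assumed purely for illustration.
corpus = ["the dog chased the cat", "the cat sat", "a dog sat"]
unigrams, bigrams = train_bigrams(corpus)
candidates = generate_candidates(["cat", "the", "sat"], unigrams, bigrams)
for score, sentence in candidates:
    print(f"{score:.3f}  {sentence}")
```

With this toy corpus, only the ordering "the cat sat" survives pruning out of the 3! = 6 permutations. In the method described above, the surviving candidates would additionally be checked against valid POS tag trigrams before ranking.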
Keywords :
computational linguistics; natural language processing; probability; speech processing; text analysis; tree searching; DFS filtering technique; POS trigram tag extraction; annotated corpus; bag-of-words; correct-sentence generation method; depth-first search filtering technique; n-gram language model; n-input words; probability score; run time analysis; sequence generation; sequence matching; sequence ranking; text corpus; trigram POS tag signature; Depth First Search; N-gram Language Model; Part of Speech Tagging; Sentence Generation; Syntax;
Conference_Title :
Advanced Communication Control and Computing Technologies (ICACCCT), 2014 International Conference on
Conference_Location :
Ramanathapuram
Print_ISBN :
978-1-4799-3913-8
DOI :
10.1109/ICACCCT.2014.7019414