Title :
Document Summarization and Information Extraction for Generation of Presentation Slides
Author :
Gokul Prasad, K. ; Mathivanan, Harish ; Jayaprakasam, Madan ; Geetha, T.V.
Author_Institution :
Dept. of Comput. Sci. & Eng., Anna Univ., Chennai, India
Abstract :
In this paper, a semi-automated technique for generating slide presentations from English text documents is proposed. The technique discussed in this paper is considered to be a pioneering attempt in the field of NLP (Natural Language Processing). The technique involves an information extractor and a slide generator, which combine NLP methods such as segmentation, chunking and summarization with linguistic features of the text such as the ontology of the words, the noun phrases found, semantic links and sentence centrality. To aid the language processing task, two tools can be used: MontyLingua, which helps in chunking, and Doddle, which helps in creating an ontology for the input text, represented as an OWL (Web Ontology Language) file. The process comprises extracting text, creating an ontology, identifying important phrases for bullets and generating slides.
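As a rough illustration of the "identifying important phrases for bullets" step, the sketch below ranks sentences by a simple centrality score and keeps the top few as slide bullets. It is not the authors' method: the word-overlap (Jaccard) score, the naive sentence splitter and the names split_sentences, centrality_scores and make_slide are all assumptions made for this example, and neither MontyLingua nor Doddle is used.

```python
import re

def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokens(sentence):
    # Lowercased word set; a fuller system would chunk noun phrases instead.
    return set(re.findall(r"[a-z]+", sentence.lower()))

def centrality_scores(sentences):
    # Score each sentence by its summed Jaccard word overlap with all others.
    bags = [tokens(s) for s in sentences]
    scores = []
    for i, a in enumerate(bags):
        total = 0.0
        for j, b in enumerate(bags):
            if i != j and (a or b):
                total += len(a & b) / len(a | b)
        scores.append(total)
    return scores

def make_slide(title, text, max_bullets=3):
    # Pick the most central sentences, then restore document order.
    sentences = split_sentences(text)
    scores = centrality_scores(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    chosen = sorted(ranked[:max_bullets])
    return {"title": title, "bullets": [sentences[i] for i in chosen]}
```

Calling make_slide("Ontologies", section_text, max_bullets=2) on one segment of a document would return a dictionary with a title and at most two bullet sentences. Sentences that share vocabulary with many others score highest, which is one plausible reading of "sentence centrality".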
Keywords :
information retrieval; knowledge representation languages; natural language processing; text analysis; Doddle; MontyLingua; NLP method; OWL file; Ontology Web Language; document summarization; English text document; information extraction; natural language processing; ontology creation; presentation slide generation; text extraction; Aggregates; Communications technology; Computer science; Data mining; Explosions; Natural languages; OWL; Ontologies; Pattern matching; Seminars; chunking; information extraction; ontology; segmentation; summarization
Conference_Title :
Advances in Recent Technologies in Communication and Computing, 2009. ARTCom '09. International Conference on
Conference_Location :
Kottayam, Kerala
Print_ISBN :
978-1-4244-5104-3
DOI :
10.1109/ARTCom.2009.74