DocumentCode :
469317
Title :
Tamil Document Summarization Using Semantic Graph Method
Author :
Banu, Mihai ; Karthika, C. ; Sudarmani, P. ; Geetha, T.V.
Author_Institution :
Anna Univ., Chennai
Volume :
2
fYear :
2007
fDate :
13-15 Dec. 2007
Firstpage :
128
Lastpage :
134
Abstract :
Document summarization is the task of producing a shorter version of an original document by selecting important sentences from the text. The sub-graph-based approach to Tamil document summarization presented here extracts sentences from an individual document to serve as a document summary or as a precursor to a generic document abstract. Language-Neutral Syntax (LNS), a representation system for natural language sentences, is used to capture the semantics of the documents. Each sentence is syntactically analyzed to produce a logical form. Subject-Object-Predicate (SOP) triples are extracted from individual sentences to create a semantic graph [2] of the original document and of the corresponding human-extracted summary. Semantic normalization is applied to the SOP triples to reduce the number of nodes in the semantic graph of the original document. A classifier is trained with the Support Vector Machine (SVM) learning algorithm to identify the SOP triples in the document semantic graph that belong to the summary. The classifier is then used to extract summaries automatically from the test documents.
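The graph-construction step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the English example triples, the `CANONICAL` lookup table, and every function name are hypothetical, and node degree is only one plausible feature that an SVM classifier could be given.

```python
from collections import defaultdict

# Hypothetical normalization table (assumption): maps surface forms
# to a single canonical node label so that equivalent terms share a node.
CANONICAL = {"films": "film", "movie": "film", "movies": "film"}


def normalize(term):
    """Lowercase a term and map it to its canonical form, if one is known."""
    term = term.strip().lower()
    return CANONICAL.get(term, term)


def build_semantic_graph(triples):
    """Build a directed graph from (subject, object, predicate) triples.

    Each subject node maps to a set of (predicate, object) edges.
    Object nodes are registered too, so every node appears as a key.
    """
    graph = defaultdict(set)
    for subj, obj, pred in triples:
        graph[normalize(subj)].add((normalize(pred), normalize(obj)))
        graph.setdefault(normalize(obj), set())
    return graph


def node_degrees(graph):
    """Count outgoing plus incoming edges for every node."""
    degree = {node: len(edges) for node, edges in graph.items()}
    for edges in graph.values():
        for _, target in edges:
            degree[target] += 1
    return degree


def triple_features(triples, graph):
    """Return one (subject degree, object degree) pair per triple."""
    degree = node_degrees(graph)
    return [(degree[normalize(s)], degree[normalize(o)]) for s, o, _ in triples]


# Illustrative triples in subject, object, predicate order (made-up data).
triples = [
    ("Director", "movie", "made"),
    ("Audience", "films", "liked"),
    ("Director", "award", "won"),
]
graph = build_semantic_graph(triples)
features = triple_features(triples, graph)
```

Here normalization merges "movie" and "films" into one node, so the graph has four nodes instead of five. In a full pipeline, feature vectors of this kind, labeled by whether the triple also appears in the human-extracted summary, would be the training input for the SVM classifier.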
Keywords :
abstracting; classification; computational linguistics; graph theory; learning (artificial intelligence); natural language processing; support vector machines; text analysis; Tamil document summarization; classifier training; document automatic summary extraction; document semantic graph method; language-neutral syntax; logical form analysis; natural language sentence representation system; semantic normalization; subject-object-predicate triple; support vector machine learning algorithm; text syntactic analysis; Application software; Computational intelligence; Computer science; Data mining; Databases; Guidelines; Information analysis; Natural languages; Support vector machine classification; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Conference on Computational Intelligence and Multimedia Applications, 2007. International Conference on
Conference_Location :
Sivakasi, Tamil Nadu
Print_ISBN :
0-7695-3050-8
Type :
conf
DOI :
10.1109/ICCIMA.2007.247
Filename :
4426682