Regional Information Center for Science and Technology - Tamil Document Summarization Using Semantic Graph Method

DocumentCode :

469317

Title :

Tamil Document Summarization Using Semantic Graph Method

Author :

Banu, Mihai ; Karthika, C. ; Sudarmani, P. ; Geetha, T.V.

Author_Institution :

Anna Univ., Chennai

Volume :

fYear :

2007

fDate :

13-15 Dec. 2007

Firstpage :

128

Lastpage :

134

Abstract :

Document summarization is the task of producing a shorter version of an original document by selecting its important sentences. This paper presents a subgraph-based method for Tamil document summarization that extracts sentences from a single document to serve as a document summary or as a precursor to a generic document abstract. Language-Neutral Syntax (LNS), a representation system for natural language sentences, is used to capture the semantics of the documents. Each sentence is analyzed syntactically to produce its logical form. Subject-Object-Predicate (SOP) triples are then extracted from the individual sentences to build a semantic graph [2] of the original document and of the corresponding human-extracted summary. Semantic normalization is applied to the SOP triples to reduce the number of nodes in the semantic graph of the original document. A Support Vector Machine (SVM) classifier is trained to identify the SOP triples in the document semantic graph that belong to the summary. The trained classifier is then used to extract summaries automatically from the test documents.
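
The graph-building steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample triples, the `CANONICAL` normalization table, and the degree-based features are all hypothetical, chosen only to show how normalization merges nodes before features are computed for each triple.

```python
from collections import defaultdict

# Hypothetical SOP triples in (subject, object, predicate) order, assumed to
# be already extracted by a parser; the extraction step itself is not shown.
doc_triples = [
    ("government", "scheme", "announce"),
    ("govt", "farmers", "help"),
    ("scheme", "farmers", "benefit"),
    ("farmers", "loan", "receive"),
]

# Hypothetical normalization table mapping variants to one canonical node.
CANONICAL = {"govt": "government"}


def normalize(triple):
    """Map each element of a triple to its canonical form."""
    return tuple(CANONICAL.get(term, term) for term in triple)


def build_graph(triples):
    """Return an adjacency map linking subject and object nodes by predicate."""
    graph = defaultdict(set)
    for subj, obj, pred in map(normalize, triples):
        graph[subj].add((pred, obj))
        graph[obj].add((pred, subj))
    return graph


def triple_features(triple, graph):
    """Illustrative features only: degree of the subject and object nodes."""
    subj, obj, _ = normalize(triple)
    return [len(graph[subj]), len(graph[obj])]


graph = build_graph(doc_triples)
features = [triple_features(t, graph) for t in doc_triples]
```

In a full pipeline, feature vectors like these would be labeled by whether each triple also occurs in the human summary's graph and passed to an SVM classifier (for example `sklearn.svm.SVC`, assuming scikit-learn is available); the record does not say which features the authors actually used.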

Keywords :

abstracting; classification; computational linguistics; graph theory; learning (artificial intelligence); natural language processing; support vector machines; text analysis; Tamil document summarization; classifier training; document automatic summary extraction; document semantic graph method; language-neutral syntax; logical form analysis; natural language sentence representation system; semantic normalization; subject-object-predicate triple; support vector machine learning algorithm; text syntactic analysis; Application software; Computational intelligence; Computer science; Data mining; Databases; Guidelines; Information analysis; Natural languages; Support vector machine classification; Support vector machines;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

International Conference on Computational Intelligence and Multimedia Applications, 2007

Conference_Location :

Sivakasi, Tamil Nadu

Print_ISBN :

0-7695-3050-8

Type :

conf

DOI :

10.1109/ICCIMA.2007.247

Filename :

4426682

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=469317