Regional Information Center for Science and Technology (RICeST) - Context and Keyword Extraction in Plain Text Using a Graph Representation

DocumentCode :

2027777

Title :

Context and Keyword Extraction in Plain Text Using a Graph Representation

Author :

Chahine, C. Abi ; Chaignaud, N. ; Kotowicz, JPh ; Pecuchet, Jean-Pierre

Author_Institution :

INSA Rouen, Mont-Saint-Aignan, France

fYear :

2008

fDate :

Nov. 30 2008-Dec. 3 2008

Firstpage :

692

Lastpage :

696

Abstract :

Document indexation is an essential task performed by archivists or automatic indexing tools. For a query to retrieve relevant documents, the keywords describing each document have to be carefully chosen. Archivists have to identify the topic of a document before they start extracting keywords. For an archivist indexing specialized documents, experience plays an important role, but indexing documents on different topics is much harder. This article proposes an innovative method for an indexing support system. The system takes as input an ontology and a plain text document and provides as output contextualized keywords of the document. The method has been evaluated by exploiting Wikipedia's category links as a termino-ontological resource.

Keywords :

data structures; document handling; indexing; Wikipedia category links; archivists; automatic indexing tools; context extraction; document indexation; graph representation; indexing support system; keyword extraction; plain text; termino-ontological resources; Data mining; Databases; Filling; Frequency; Hidden Markov models; Internet; Machine assisted indexing; Mathematics; Ontologies; Wikipedia; Graph; Knowledge representation; Web semantic;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal Image Technology and Internet Based Systems, 2008. SITIS '08. IEEE International Conference on

Conference_Location :

Bali

Print_ISBN :

978-0-7695-3493-0

Type :

conf

DOI :

10.1109/SITIS.2008.47

Filename :

4725873

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2027777