Title :
Ontology Based Structured Representation for Domain Specific Unstructured Documents
Author :
Shashirekha, H.L. ; Murali, S.
Author_Institution :
P.E.S. Coll. of Eng., Mandya
Abstract :
Extracting information from brief, unstructured text is challenging when the text consists of short phrases, incomplete sentences, unordered word sequences, and abbreviated words that follow no regular syntax. This paper describes an approach that automatically extracts information from data-rich unstructured text documents using a domain-dependent ontology and populates a database with the results. We apply pattern matching on keywords and constants to extract patterns and generate a structured text representation with respect to a domain-specific ontology. The approach is illustrated on one such brief, unstructured text type: the classified matrimonial advertisement. A performance analysis of the approach on this case study is presented.
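The pipeline the abstract outlines (keyword/constant pattern matching guided by an ontology, followed by database population) can be sketched roughly as below. This is only an illustrative sketch, not the authors' implementation: the slot names, the keyword lists in ONTOLOGY, the table name ads, and the functions extract and populate are all hypothetical, chosen to resemble a matrimonial advertisement.

```python
import re
import sqlite3

# Hypothetical mini-ontology (an assumption, not taken from the paper):
# each slot is tied to a pattern built from keywords or constants.
ONTOLOGY = {
    "age": re.compile(r"\b(\d{2})\s*(?:yrs?|years?)\b", re.IGNORECASE),
    "religion": re.compile(r"\b(Hindu|Muslim|Christian|Sikh|Jain)\b", re.IGNORECASE),
    "education": re.compile(r"\b(BE|MBA|MBBS|MCA)\b"),
    "profession": re.compile(r"\b(engineer|doctor|teacher|lecturer)\b", re.IGNORECASE),
}


def extract(ad_text):
    """Return one structured record: slot name -> matched text, or None."""
    record = {}
    for slot, pattern in ONTOLOGY.items():
        match = pattern.search(ad_text)
        record[slot] = match.group(1) if match else None
    return record


def populate(conn, records):
    """Store the extracted records in a table with one column per slot."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ads "
        "(age TEXT, religion TEXT, education TEXT, profession TEXT)"
    )
    conn.executemany(
        "INSERT INTO ads VALUES (:age, :religion, :education, :profession)",
        records,
    )
    conn.commit()
```

Because each slot is matched independently, word order in the advertisement does not matter, which suits text with no regular syntax. A slot with no matching keyword is stored as NULL rather than guessed.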
Keywords :
data structures; document handling; information retrieval; ontologies (artificial intelligence); text analysis; classified matrimonial advertisement; domain dependent ontology; domain specific unstructured documents; information extraction; ontology based structured representation; pattern matching; structured text representation; text documents; Computational intelligence; Data mining; Databases; Educational institutions; IEEE news; Information retrieval; Natural languages; Ontologies; Pattern matching; Text recognition;
Conference_Title :
International Conference on Computational Intelligence and Multimedia Applications, 2007
Conference_Location :
Sivakasi, Tamil Nadu
Print_ISBN :
0-7695-3050-8
DOI :
10.1109/ICCIMA.2007.255