Title :
A Blocking Scheme for Identification of Components and Sub-Components of Semi-Structured E-Documents
Author_Institution :
Armstrong Atlantic State Univ., Savannah
Abstract :
To convert a semi-structured e-document into an XML e-document, the physical components and sub-components of the e-document must be tagged. Such tagging is possible once the components and sub-components of the e-document have been identified. A methodology, called the blocking scheme, is devised and used in this research effort to identify the components and sub-components of semi-structured e-documents. The methodology was applied to 50 semi-structured e-documents of a specific type, and the identification of their components and sub-components was achieved with 100% accuracy.
Keywords :
XML; document handling; blocking scheme; semistructured e-documents; Content based retrieval; Drugs; Information technology; Skin; Spatial databases; Strips; Tagging; Telephony; Text analysis; Layout Analysis; Overt and Covert Semi-structured Documents; Strip; Strip Block; Folded Strip
Conference_Title :
Information Technology: New Generations, 2008. ITNG 2008. Fifth International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
0-7695-3099-0
DOI :
10.1109/ITNG.2008.40