Title :
Introducing shadows: Flexible document representation and annotation on the Web
Author :
Mota, M.S. ; Medeiros, C.B.
Author_Institution :
Inst. of Comput., Univ. of Campinas (UNICAMP), Campinas, Brazil
Abstract :
The Web is witnessing exponential growth in increasingly complex, distributed and heterogeneous documents. This hampers document exchange, annotation and retrieval. While information retrieval mechanisms concentrate on textual features (corpus analysis), annotation approaches either target specific formats or require that a document follow interoperable standards. This work presents our effort to handle these problems with a more flexible solution. Rather than trying to modify or convert the document itself, or to target only textual characteristics, the strategy described in this work is based on an intermediate descriptor: the document shadow. A shadow represents domain-relevant aspects and elements of both the structure and the content of a given document, as defined by a user group. It is the shadows, rather than the documents themselves, that are annotated, which makes annotations independent of document formats. Our annotations take advantage of the Linked Open Data (LOD) initiative. Via annotations, users can derive correlations across shadows in a flexible way. Moreover, shadows and annotations are stored in databases, allowing uniform database treatment of heterogeneous documents.
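The abstract does not give a schema or an API, so the following is only a minimal sketch of the idea it describes: shadow elements extracted from documents are stored in a database, annotations attach concept URIs to those elements rather than to the documents, and documents are correlated through shared annotations. Every name here (the tables `shadow_element` and `annotation`, the functions, the `example.org` URI, the file names) is a hypothetical chosen for illustration and is not the authors' implementation.

```python
import sqlite3


def build_store():
    """Create an in-memory store for shadows and annotations (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE shadow_element (
            element_id INTEGER PRIMARY KEY,
            doc_uri    TEXT NOT NULL,  -- the original document, left unmodified
            kind       TEXT NOT NULL,  -- structural role, e.g. 'table_caption'
            value      TEXT NOT NULL   -- extracted content
        );
        CREATE TABLE annotation (
            element_id  INTEGER NOT NULL REFERENCES shadow_element(element_id),
            concept_uri TEXT NOT NULL  -- e.g. a Linked Open Data resource
        );
        """
    )
    return conn


def add_element(conn, doc_uri, kind, value, concept_uris=()):
    """Store one shadow element and annotate it with zero or more concept URIs."""
    cur = conn.execute(
        "INSERT INTO shadow_element (doc_uri, kind, value) VALUES (?, ?, ?)",
        (doc_uri, kind, value),
    )
    element_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO annotation (element_id, concept_uri) VALUES (?, ?)",
        [(element_id, uri) for uri in concept_uris],
    )
    return element_id


def correlated_documents(conn, concept_uri):
    """Return documents whose shadows carry the given annotation, sorted by URI."""
    rows = conn.execute(
        """
        SELECT DISTINCT e.doc_uri
        FROM shadow_element AS e
        JOIN annotation AS a ON a.element_id = e.element_id
        WHERE a.concept_uri = ?
        ORDER BY e.doc_uri
        """,
        (concept_uri,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_store()
    concept = "http://example.org/concept/species-x"  # placeholder URI
    add_element(conn, "report.pdf", "section_title", "Field observations", [concept])
    add_element(conn, "survey.xml", "table_caption", "Specimen counts", [concept])
    add_element(conn, "notes.doc", "paragraph", "Unrelated text")
    print(correlated_documents(conn, concept))
```

In this sketch the PDF and XML documents are queried through the same two tables, which is one way to read the claim that annotating shadows decouples annotations from document formats.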
Keywords :
Internet; content management; data structures; document handling; information retrieval; open systems; LOD initiative; Web annotation; annotations users; complex documents; database storage; distributed documents; document annotation; document conversion; document exchange; document shadow; domain-relevant aspects; flexible document representation; heterogeneous documents; information retrieval mechanisms; interoperable standards; textual characteristics; textual features; uniform database treatments; Biodiversity; Data mining; Databases; Feature extraction; Semantics; Standards; XML;
Conference_Title :
Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on
Conference_Location :
Brisbane, QLD
Print_ISBN :
978-1-4673-5303-8
Electronic_ISBN :
978-1-4673-5302-1
DOI :
10.1109/ICDEW.2013.6547416