DocumentCode :
2581455
Title :
Advanced technology for managing XML document collection
Author :
Hsiao, Hui-I
Author_Institution :
IBM Almaden Res. Center, San Jose, CA, USA
fYear :
2005
fDate :
15-16 Aug. 2005
Abstract :
Organizing large document collections so that information can be found easily and quickly has always been a challenging problem. In the last few years, XML has become the de facto standard for content publishing and data exchange. The proliferation of XML documents and data has created new challenges and opportunities for managing document collections. Existing technologies for automatically organizing document collections are either imprecise or based only on simple grouping criteria. Since XML documents are self-describing, it is possible to automatically categorize them precisely, according to their content. With the availability of standard XML query languages, e.g., XQuery, much more powerful folder and categorization technologies are now feasible. To address this new challenge and exploit this new opportunity, this paper describes a new and powerful categorization technology. This technology fully exploits the rich data model and semantic information embedded in XML documents to dynamically and precisely categorize XML document collections. Besides supporting directory-like document look-up operations, this technology also provides advanced operations such as multi-path navigation and document traversal across multiple collections. A preliminary performance study shows that this new categorization technology is both efficient and scalable. Thus, it is an ideal technology for automating the process of organizing and categorizing XML documents.
Keywords :
XML; data models; XML document collection; XML query language; XQuery; content publishing; data exchange; data model; directory-like document look-up operation; document categorization technology; multipath navigation; semantic information; Chaos; Data models; Database languages; Navigation; Organizing; Publishing; Standards publication; Technology management; Web pages; XML;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Emerging Information Technology Conference, 2005.
Print_ISBN :
0-7803-9328-7
Type :
conf
DOI :
10.1109/EITC.2005.1544381
Filename :
1544381