XML keyword search based on maximum repetitive unit

Author

Wang, Desheng ; Liu, Guiquan ; Luo, Qiming ; Chen, Enhong

Author_Institution

Dept. of Comput. Sci. & Technol., Univ. of Sci. & Technol. of China, Hefei, China

fYear

2011

fDate

27-29 June 2011

Firstpage

1891

Lastpage

1895

Abstract

As XML becomes the standard for data representation and exchange, effective and efficient methods for XML data retrieval have become increasingly important. In practice, XML documents tend to have a shallow and wide structure and contain a large number of duplicate units. Based on these characteristics, we propose a novel algorithm for keyword search in XML documents based on the maximum repetitive unit. The basic idea of the algorithm is as follows. First, the duplicate structures of XML documents are extracted as repetitive units. Then, the units that contain all the query keywords are identified. The results returned are the repetitive units associated with the query. The experiments show that the algorithm is scalable and efficient, and that it returns query results with good semantic integrity.
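
The two steps outlined in the abstract (extract repeated structures as units, then keep the units that contain every query keyword) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes, purely for illustration, that a "repetitive unit" is an element whose tag occurs more than once among its siblings, and that "maximum" means the outermost such element. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def maximal_repetitive_units(root):
    """Collect the outermost elements whose tag repeats among their siblings.

    Assumption for this sketch: a repeated sibling tag marks a repetitive unit.
    """
    units = []

    def visit(parent):
        counts = Counter(child.tag for child in parent)
        for child in parent:
            if counts[child.tag] > 1:
                # Repeated sibling: treat it as a unit and do not descend further.
                units.append(child)
            else:
                visit(child)

    visit(root)
    return units


def unit_terms(unit):
    """Return the lowercase whitespace-separated words found in a unit's text."""
    words = set()
    for text in unit.itertext():
        words.update(text.lower().split())
    return words


def keyword_search(root, keywords):
    """Return the units whose text contains all of the query keywords."""
    wanted = {k.lower() for k in keywords}
    return [u for u in maximal_repetitive_units(root) if wanted <= unit_terms(u)]


# Hypothetical shallow, wide document with many duplicate <book> structures.
SAMPLE = """
<library>
  <book><title>XML Retrieval</title><author>Wang</author></book>
  <book><title>Keyword Search</title><author>Liu</author></book>
  <book><title>XML Keyword Search</title><author>Chen</author></book>
</library>
""".strip()

root = ET.fromstring(SAMPLE)
hits = keyword_search(root, ["xml", "keyword"])
titles = [h.findtext("title") for h in hits]
print(titles)
```

Returning whole units rather than individual matching nodes is what keeps each result self-contained in this sketch: every hit is a complete `<book>` record, not an isolated `<title>` fragment.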

Keywords

XML; data structures; electronic data interchange; query processing; XML data retrieval; XML documents; XML keyword search; data exchange; data representation; maximum repetitive unit; query keyword; semantic integrity; Algorithm design and analysis; HTML; Keyword search; Knowledge engineering; Semantics; XML keyword retrieval; repetitive unit; smallest lowest common ancestor

fLanguage

English

Publisher

IEEE

Conference_Title

Computer Science and Service System (CSSS), 2011 International Conference on

Conference_Location

Nanjing

Print_ISBN

978-1-4244-9762-1

Type

conf

DOI

10.1109/CSSS.2011.5974836

Filename

5974836