• DocumentCode
    1300627
  • Title

    Returning Clustered Results for Keyword Search on XML Documents

  • Author

    Liu, Xiping ; Wan, Changxuan ; Chen, Lei

  • Author_Institution
    Sch. of Inf. Technol., Jiangxi Univ. of Finance & Econ., Nanchang, China
  • Volume
    23
  • Issue
    12
  • fYear
    2011
  • Firstpage
    1811
  • Lastpage
    1825
  • Abstract
    Keyword search is an effective paradigm for information discovery and has been introduced recently to query XML documents. In this paper, we address the problem of returning clustered results for keyword search on XML documents. We first propose a novel semantics for answers to an XML keyword query. The core of the semantics is the conceptually related relationship between keyword matches, which is based on the conceptual relationship between nodes in XML trees. Then, we propose a new clustering methodology for XML search results, which clusters results according to the way they match the given query. Two approaches to implement the methodology are discussed. The first approach is a conventional one which does clustering after search results are retrieved; the second one clusters search results actively, which has characteristics of clustering on the fly. The generated clusters are then organized into a cluster hierarchy with different granularities to enable users to locate the results of interest easily and precisely. Experimental results demonstrate the meaningfulness of the proposed semantics as well as the efficiency of the proposed methods.
  • Keywords
    XML; query processing; XML document query; XML keyword query; XML trees; eXtensible Markup Language; information discovery; keyword search; Cloud computing; Clustering algorithms; Databases; Keyword search; Pattern matching; Search methods; Semantics; XML; information retrieval; XML keyword search; cluster hierarchy; search results clustering
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2011.183
  • Filename
    5989812