• DocumentCode
    3028182
  • Title

    TIIS: A two-level inverted-index scheme for large-scale data processing in the parallel database system

  • Author

    Yu Lei ; Ge Fu ; Huaiyuan Tan ; Yan Jin ; Hong Zhang ; Xinran Liu ; Xiaojia Xiang

  • Author_Institution
    Inst. of Comput. Technol., Beijing, China
  • fYear
    2013
  • fDate
    20-22 Dec. 2013
  • Firstpage
    2540
  • Lastpage
    2547
  • Abstract
    Built on a Service-Oriented Architecture, parallel database middleware is an inexpensive solution that gathers standalone database instances to provide users with a highly scalable relational data management platform. However, with the advent of the era of large-scale data, such platforms face a serious challenge in text data retrieval. Motivated by this observation, a parallel database middleware based on semi-structured data is first designed to support text retrieval. Then, a two-level inverted-index scheme called TIIS is designed for full-text queries. The advantages of TIIS are that it can quickly locate the result data in a large distributed database cluster storing large-scale data, and that it can greatly reduce network I/O and disk I/O. Experimental results show that, compared with Hive using the Hadoop Distributed File System in the same hardware environment, our system performs typical TPC-H data analysis, and the cost of full-text queries on 2 GB of commercial data is reduced by 90% on average.
  • Keywords
    data analysis; database indexing; middleware; parallel databases; query processing; text analysis; Hadoop distributed file system; Hive; TIIS; TPC-H data analysis; disk I/O reduction; full-text query; large cluster distributed database; large-scale data processing; network I/O reduction; parallel database middleware; parallel database system; semistructured data; text retrieval; two-level inverted-index scheme; Distributed databases; Indexes; Middleware; Query processing; Text analysis; full text retrieval; inverted index; parallel database; query optimization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mechatronic Sciences, Electric Engineering and Computer (MEC), Proceedings 2013 International Conference on
  • Conference_Location
    Shenyang
  • Print_ISBN
    978-1-4799-2564-3
  • Type

    conf

  • DOI
    10.1109/MEC.2013.6885464
  • Filename
    6885464