• DocumentCode
    501211
  • Title

    A Distributed Parallel Algorithm for Web Page Inverted Indexes Construction on the Cluster Computing Systems

  • Author

    Zhengyou, Liang ; Tao, Chen

  • Author_Institution
    Dept. of Comput., Electron. & Inf., Guangxi Univ., Nanning, China
  • Volume
    2
  • fYear
    2009
  • fDate
    15-17 May 2009
  • Firstpage
    33
  • Lastpage
    36
  • Abstract
    To address the slow indexing speed of serial algorithms for constructing Web page inverted indexes, and observing that the merge-sort algorithm fits the theory of scheduling divisible loads in parallel and distributed systems, this paper proposes a new parallel algorithm based on triple sort-merge for Web page inverted index construction. The algorithm distributes and parallelizes the two most time-consuming tasks in inverted index construction: parsing terms and sorting the resulting term postings. Each term is represented as a triple, and the time complexity of the algorithm is analyzed. The paper also uses ProActive, a Java middleware, to design and implement a distributed parallel Web page indexer named P_Indexer on cluster computing systems. The algorithm analysis and experimental results show that the parallel algorithm achieves high efficiency and good scalability.
  • Keywords
    Internet; Java; computational complexity; middleware; parallel algorithms; Java middleware; P Indexer; ProActive; Web page inverted indexes construction; cluster computing systems; distributed parallel algorithm; merge-sort algorithm; scheduling theory; time complexity; Algorithm design and analysis; Clustering algorithms; Concurrent computing; Distributed computing; Java; Parallel algorithms; Processor scheduling; Scheduling algorithm; Sorting; Web pages; ProActive middleware; Web page indexer; distributed parallel; inverted indexes; text search;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology and Applications, 2009. IFITA '09. International Forum on
  • Conference_Location
    Chengdu
  • Print_ISBN
    978-0-7695-3600-2
  • Type

    conf

  • DOI
    10.1109/IFITA.2009.553
  • Filename
    5231316
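
The abstract describes splitting the parsing and sorting work across nodes and then merging sorted runs of term triples. The sketch below illustrates that general idea only; it is not the paper's P_Indexer. It is written in Python rather than on the Java/ProActive stack the paper used, local threads stand in for cluster nodes, documents are divided evenly instead of by a divisible-load schedule, and the triple layout `(term, doc_id, frequency)` is an assumption, since the abstract does not say what the triple contains. The names `parse_and_sort` and `build_index` are hypothetical.

```python
import heapq
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def parse_and_sort(chunk):
    """Worker step: turn (doc_id, text) pairs into a sorted run of triples.

    Assumed triple layout: (term, doc_id, frequency).
    """
    triples = []
    for doc_id, text in chunk:
        counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        triples.extend((term, doc_id, freq) for term, freq in counts.items())
    triples.sort()  # each worker sorts its own run locally
    return triples


def build_index(docs, workers=2):
    """Split docs evenly across workers, then k-way merge the sorted runs."""
    chunks = [docs[i::workers] for i in range(workers)]
    # Threads are a stand-in for cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(parse_and_sort, chunks))
    index = {}
    # heapq.merge yields triples in global sorted order, so each term's
    # postings list comes out ordered by doc_id.
    for term, doc_id, freq in heapq.merge(*runs):
        index.setdefault(term, []).append((doc_id, freq))
    return index


docs = [
    (1, "web page index"),
    (2, "parallel index of web pages"),
    (3, "index index"),
]
index = build_index(docs)
```

Because every run is already sorted, the merge step only compares the heads of the runs, which is why the sort-then-merge structure divides cleanly across workers.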