Title :
Parallel rank aggregation for theWorld Wide Web
Author :
Beg, M. M Sufyan
Author_Institution :
Dept. of Comput. Eng., AMU, Aligarh, India
Abstract :
Rank aggregation is the problem of generating a "consensus" ranking for a given set of rankings. When applied to the Web, this finds applications in meta-searching, spam fighting and word association techniques. The rank aggregation obtained by optimizing the Spearman footrule distance is called footrule optimal aggregation (FOA), and it also satisfies the Condorcet property. We find in literature a polynomial time algorithm to compute the FOA for full lists. However, when dealing with the results from various search engines, the lists are almost invariably always the partial ones. The FOA for partial lists is NP-hard. This NP-hard nature of partial footrule optimal aggregation problem (PFOA) motivates us to apply genetic algorithm (GA) for the PFOA problem. The GA based technique may take long to compute, but we propose to decide upon the number of generations of GA based on the time limit allowed by the user. Moreover, the inherent parallelism of GA is also utilized to speed up the processing. For GA based rank aggregation, we depart from having the pool of chromosomes represented in binary to those represented in decimal itself. We achieve crossover by carrying out multiplication of permutations. For mutation, the to-be-mutated digit is exchanged with any other randomly selected digit in that very permutation. For experimental procedure falls in line with the ones found in literature. The results for rank aggregation using genetic algorithm are much better, as compared to the ones obtained using the classical Borda\´s method of rank aggregation.
Keywords :
Internet; genetic algorithms; parallel algorithms; search engines; Bordas method; Condorcet property; NP hard problem; Spearman footrule distance; World Wide Web; chromosomes; footrule optimal aggregation; genetic algorithm; meta searching; mutation; parallel rank aggregation; parallelism; permutation; polynomial time algorithm; search engines; spam fighting; word association techniques; Application software; Biological cells; Concurrent computing; Databases; Genetic algorithms; Genetic mutations; Parallel processing; Search engines; Web pages; Web sites;
Conference_Titel :
Intelligent Sensing and Information Processing, 2004. Proceedings of International Conference on
Print_ISBN :
0-7803-8243-9
DOI :
10.1109/ICISIP.2004.1287688