Title :
Achieving magnitude order improvement in Porter stemmer algorithm over multi-core architecture
Author :
Singh, Amik ; Kumar, Naresh ; Gera, Sahil ; Mittal, Ankush
Author_Institution :
Dept. of Electron. & Comput. Eng., Indian Inst. of Technol., Roorkee, India
Abstract :
NLP search is slow because corpora are large and servers receive a very large number of hits. At present, search engines rely on many thousands of servers to provide real-time search, so faster alternatives are sought. In this paper, we present pioneering work in this direction: we take word stemming, a crucial component of search and indexing algorithms, and show how significant performance gains can be achieved on multi-core architectures, which will serve as home computers in the near future. We present our analysis of Porter's stemming algorithm on the Cell Broadband Engine and describe how SIMD operations can be used to maximize performance. Our results show that Cell processors provide performance gains of over 50 times over popular Intel processors and hence have tremendous potential for NLP-IR applications.
Keywords :
information retrieval; multiprocessing systems; natural language processing; parallel processing; search engines; word processing; Cell Broadband Engine; Intel processors; NLP search; SIMD operations; indexing algorithms; magnitude order improvement; multicore architecture; Porter stemmer algorithm; search algorithms; word stemming; Algorithm design and analysis; Clustering algorithms; Computer architecture; Home computing; Indexing; Multicore processing; Natural language processing; Performance analysis; Performance gain; Search engines; Cell Processor; Information Retrieval; Natural Language Processing; Porter Stemmer; Search Engine
Conference_Title :
Informatics and Systems (INFOS), 2010 The 7th International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4244-5828-8