DocumentCode :
2503528
Title :
Parallel Information Extraction on Shared Memory Multi-processor System
Author :
Shan, Jiulong ; Chen, Yurong ; Diao, Qian ; Zhang, Yimin
Author_Institution :
Intel China Res. Center, Beijing
fYear :
2006
fDate :
14-18 Aug. 2006
Firstpage :
311
Lastpage :
318
Abstract :
Text mining is one of the best solutions to the information explosion of today and the future. With the development of modern processor technologies, it will become a mass-market desktop application in the many-core era. In a text mining system, information extraction is a representative module and the most compute-intensive part. In this paper, we study the performance of parallel information extraction on shared memory multi-processor systems in order to gain insight into how such applications will behave on future many-core architectures. In our implementation, the conditional random fields (CRFs) algorithm is selected as the core of the information extraction module. Based on FlexCRFs, the newest CRFs toolkit, we make several serial optimizations and then parallelize it with MPI and System V IPC/shm. We also conduct a detailed performance analysis of this parallel application on the target system.
Keywords :
data mining; information retrieval; message passing; parallel processing; shared memory systems; CRFs toolkit; FlexCRF; MPI; System V IPC/shm; conditional random fields; many-core architecture; parallel information extraction; serial optimization; shared memory multiprocessor system; text mining; Bandwidth; Computer architecture; Data mining; Explosions; Humans; Internet; Performance analysis; Performance gain; Search engines; Text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing, 2006. ICPP 2006. International Conference on
Conference_Location :
Columbus, OH
ISSN :
0190-3918
Print_ISBN :
0-7695-2636-5
Type :
conf
DOI :
10.1109/ICPP.2006.58
Filename :
1690633