Title :
Data intensive parallel feature selection method study
Author :
Zhanquan Sun ; Zhao Li
Author_Institution :
Shandong Provincial Key Lab. of Comput. Network, Shandong Comput. Sci. Center, Jinan, China
Abstract :
Feature selection is an important research topic in machine learning and pattern recognition. It is effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving the comprehensibility of results. With the development of computer science, a data deluge has occurred in many application fields. Classical feature selection methods break down on large-scale datasets because of their expensive computational cost. This paper concentrates on the study of a data-intensive parallel feature selection method based on the MapReduce programming model. In each map node, a novel method is used to calculate mutual information, and the combinatory contribution degree is used to determine the number of selected features. In each epoch, the features selected by all map nodes are collected at a reduce node, from which one feature is selected through synthesization. The parallel feature selection method is scalable, and its efficiency is illustrated through an example analysis.
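The following is a minimal, self-contained sketch of the map/reduce pattern described in the abstract, not the authors' implementation: the data are partitioned into blocks, each "map" step scores the remaining features by mutual information with the class label on its block, and a "reduce" step synthesizes the per-block scores (here by simple averaging, an assumed stand-in for the paper's synthesization) and greedily selects one feature per epoch. The combinatory contribution degree stopping rule is replaced by a fixed number of epochs for illustration.

    import numpy as np

    def mutual_information(x, y):
        """Mutual information (in nats) between two discrete 1-D arrays."""
        n = len(x)
        joint = {}
        for xi, yi in zip(x, y):
            joint[(xi, yi)] = joint.get((xi, yi), 0) + 1
        px = {v: np.mean(x == v) for v in set(x)}
        py = {v: np.mean(y == v) for v in set(y)}
        mi = 0.0
        for (xi, yi), c in joint.items():
            pxy = c / n
            mi += pxy * np.log(pxy / (px[xi] * py[yi]))
        return mi

    def map_step(X_block, y_block, remaining):
        """Map node: score every remaining feature on the local data block."""
        return {j: mutual_information(X_block[:, j], y_block) for j in remaining}

    def reduce_step(partial_scores):
        """Reduce node: average per-block scores and return the best feature index."""
        merged = {}
        for scores in partial_scores:
            for j, s in scores.items():
                merged.setdefault(j, []).append(s)
        return max(merged, key=lambda j: np.mean(merged[j]))

    def parallel_feature_selection(X, y, n_blocks=4, n_select=3):
        """Greedy MapReduce-style selection: one feature per epoch."""
        blocks = np.array_split(np.arange(len(y)), n_blocks)
        remaining, selected = set(range(X.shape[1])), []
        for _ in range(n_select):
            partial = [map_step(X[idx], y[idx], remaining) for idx in blocks]
            best = reduce_step(partial)
            selected.append(best)
            remaining.remove(best)
        return selected

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        y = rng.integers(0, 2, 400)
        informative = (y[:, None] + rng.integers(0, 2, (400, 2))) % 3  # correlated with y
        noise = rng.integers(0, 3, (400, 4))                           # uninformative features
        X = np.hstack([informative, noise])
        print("selected features:", parallel_feature_selection(X, y))

In an actual MapReduce deployment, map_step would run on separate workers holding their own data blocks and reduce_step would run on the reduce node each epoch; the single-machine loop above only mirrors that data flow.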
Keywords :
feature selection; parallel programming; MapReduce program model; combinatory contribution degree; computational cost; data deluge; data intensive parallel feature selection method; dimensionality reduction; epoch; irrelevant data removal; large-scale dataset processing; learning accuracy improvement; map node collection; mutual information; node reduction; result comprehensibility improvement; synthesization; Computational modeling; Entropy; Joints; Mutual information; Support vector machines; Training; Vectors; Feature selection; MapReduce; contribution degree; mutual information;
Conference_Titel :
Neural Networks (IJCNN), 2014 International Joint Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-6627-1
DOI :
10.1109/IJCNN.2014.6889409