Title of article :
Correlation-based Attribute Selection using Genetic Algorithm
Author/Authors :
Rajdev Tiwari (Author), Manu Pratap Singh (Author)
Issue Information :
Periodical with serial issue number, year 2010
Pages :
7
From page :
28
To page :
34
Abstract :
Integrating data sources to build a data warehouse (DW) means developing a common schema, together with data transformation solutions, for a number of data sources with related content. The large number and size of modern data sources make the integration process cumbersome. In such cases, the dimensionality of the data is reduced before the DW is populated. Attribute subset selection based on relevance analysis is one way to reduce the dimensionality. Relevance analysis of attributes is carried out by means of correlation analysis, which detects redundant attributes, that is, attributes that do not contribute significantly to the characteristics of the data as a whole. A redundant attribute, or an attribute strongly correlated with some other attribute, is then excluded from the DW. Automated tools based on existing methods for attribute subset selection may not yield an optimal set of attributes, which may degrade the performance of the DW. Various researchers have used the genetic algorithm (GA) as an optimization tool, but most of them use it to search for the best technique among the available attribute selection techniques. This paper formulates and validates a method for selecting an optimal attribute subset based on correlation, in which the GA itself serves as the search tool for selecting the subset of attributes.
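The abstract does not give the paper's encoding, fitness function, or GA parameters, so the following Python code is only a minimal sketch of the general idea, not the authors' method. It assumes a bit-mask chromosome (one bit per attribute), a hypothetical fitness that rewards each kept attribute and penalizes each pair of kept attributes whose absolute Pearson correlation reaches an assumed threshold of 0.9, and illustrative settings for population size, generations, and mutation rate. The names `pearson`, `fitness`, and `select_attributes` are invented for this example.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / sqrt(vx * vy)


def fitness(mask, corr, threshold=0.9):
    """Assumed fitness: +1 per kept attribute, -2 per strongly correlated kept pair."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    redundant = sum(
        1
        for a in range(len(chosen))
        for b in range(a + 1, len(chosen))
        if abs(corr[chosen[a]][chosen[b]]) >= threshold
    )
    return len(chosen) - 2 * redundant


def select_attributes(columns, threshold=0.9, pop_size=20,
                      generations=40, mutation_rate=0.1, seed=0):
    """Search attribute bit masks with a simple GA (needs at least 2 columns)."""
    rng = random.Random(seed)
    n = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(n)] for i in range(n)]

    def score(mask):
        return fitness(mask, corr, threshold)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the best mask over unchanged.
        next_gen = [max(population, key=score)[:]]
        while len(next_gen) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            # Single-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n - 1)
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    return best, score(best)


# Toy data (invented): column 1 is a linear copy of column 0,
# and column 3 is the negation of column 2.
a = [1, 2, 3, 4, 5, 6]
c = [1, -1, -1, 1, 1, -1]
columns = [a, [2 * v + 1 for v in a], c, [-v for v in c]]

best_mask, best_score = select_attributes(columns)
print("kept attributes:", [i for i, bit in enumerate(best_mask) if bit])
```

On the toy data the search should keep one attribute from each strongly correlated pair; which member of a pair survives depends on the random seed. A real application would replace the assumed fitness with whatever criterion the integration task requires.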
Keywords :
Data Source Integration, Correlation analysis, Relevance Analysis, Genetic algorithm, Attribute subset, Data warehouse
Journal title :
International Journal of Computer Applications
Serial Year :
2010
Record number :
659906