DocumentCode
3320176
Title
Some Results about Mutual Information-based Feature Selection and Fuzzy Discretization of Vague Data
Author
Sánchez, Luciano ; Suárez, M. Rosario ; Villar, J.R. ; Couso, Inés
Author_Institution
Oviedo Univ., Gijon
fYear
2007
fDate
23-26 July 2007
Firstpage
1
Lastpage
6
Abstract
Algorithms for preprocessing databases with incomplete and imprecise data are seldom studied, partly because we lack numerical tools to quantify the interdependency between fuzzy random variables. In particular, many filter-type feature selection algorithms rely on crisp discretizations to estimate the mutual information between continuous variables, which effectively prevents the use of vague data. Fuzzy rule-based systems, in turn, pass continuous input variables through their own fuzzification interface. In the context of feature selection, if we rank the relevance of the inputs by their mutual information, an apparently informative variable may turn out to be useless once it has been codified as a fuzzy subset of our catalog of linguistic terms. In this paper we propose to address both problems by estimating the mutual information with the same set of fuzzy partitions that will be used to codify the antecedents of the fuzzy rules. That is to say, we introduce a numerical algorithm for estimating the mutual information between two fuzzified continuous variables. This algorithm can be included in certain feature selection algorithms, and can also be used to obtain the most informative fuzzy partition for the data. The use of our definition is exemplified with some benchmark problems.
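The abstract does not spell out the estimator, so the following is only a minimal sketch of one plausible reading of "mutual information between two fuzzified continuous variables", not the authors' algorithm. It assumes crisp samples, uniform triangular partitions whose memberships sum to one, and a joint distribution over linguistic terms obtained by averaging products of memberships. The names `triangular_partition` and `fuzzy_mutual_information` are hypothetical, and imprecise (interval or fuzzy) observations are not handled.

```python
import numpy as np


def triangular_partition(values, n_terms):
    """Membership matrix (n_samples, n_terms) for a uniform triangular partition.

    Assumption: terms are evenly spaced over the observed range, so the
    memberships of every sample sum to one.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if n_terms < 2 or hi == lo:
        raise ValueError("need at least two terms and a non-constant variable")
    centers = np.linspace(lo, hi, n_terms)
    width = centers[1] - centers[0]
    memberships = 1.0 - np.abs(values[:, None] - centers[None, :]) / width
    return np.clip(memberships, 0.0, 1.0)


def fuzzy_mutual_information(x, y, n_terms_x=3, n_terms_y=3):
    """Mutual information (in nats) between the fuzzified versions of x and y."""
    mu_x = triangular_partition(x, n_terms_x)
    mu_y = triangular_partition(y, n_terms_y)
    # Joint probability of each pair of linguistic terms: mean product of memberships.
    joint = mu_x.T @ mu_y / mu_x.shape[0]
    px = joint.sum(axis=1, keepdims=True)  # marginal over the terms of x
    py = joint.sum(axis=0, keepdims=True)  # marginal over the terms of y
    mask = joint > 0  # skip empty cells, where p * log(p) is taken as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))
```

Under these assumptions, ranking candidate inputs by `fuzzy_mutual_information(feature, target)` with the partitions later used in the rule antecedents would score each feature as the fuzzy system actually sees it, and comparing the value across different numbers of terms gives a rough way to pick a more informative partition.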
Keywords
computational linguistics; feature extraction; fuzzy set theory; codification; feature selection; fuzzification interface; fuzzy discretization; fuzzy random variables; fuzzy rule based systems; linguistic terms; mutual information; vague data; Data preprocessing; Fuzzy sets; Fuzzy systems; Information filtering; Information filters; Knowledge based systems; Mutual information; Partitioning algorithms; Random variables; Spatial databases;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems Conference, 2007. FUZZ-IEEE 2007. IEEE International
Conference_Location
London
ISSN
1098-7584
Print_ISBN
1-4244-1209-9
Electronic_ISBN
1098-7584
Type
conf
DOI
10.1109/FUZZY.2007.4295665
Filename
4295665