Abstract:
An information table or a training/design sample set is often all that is available for inferring the underlying generation mechanism (distribution) of tuples or samples. How such a table should be represented, treated, and interpreted, however, remains open to discussion. In this paper, these matters are examined on the basis of "granularity." First, an explanation is given of why different research fields pursue different goals and treatments of information tables; at this stage, it is emphasized that the concept of granularity plays an important role. Next, a framework of information tables is reformulated in terms of attribute sets and tuple sets, where a "Galois connection" helps clarify their relationship. Then, the use of "closed subsets" in place of the given tuples is proposed, for both efficiency and interpretability. With a special type of closed subset, the traditional logical DNF expression framework can be naturally extended to handle multivalued and continuous attributes. Last, several rough set concepts are reformulated using "variable granularity" connected to closed subsets. The paper identifies how, and in which respects, granularity provides flexibility in dealing with these problems, and the concepts defined here offer some intuition toward the further development of data exploration and data mining.
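To make the attribute-set/tuple-set relationship concrete, the following is a minimal sketch (not taken from the paper; the toy table and the helper names common_pairs, matching_tuples, and closure are illustrative assumptions) of the two maps of a Galois connection between sets of tuples and sets of attribute-value pairs, whose composition is a closure operator; its fixed points correspond to the "closed subsets" mentioned above.

```python
# Toy information table: each tuple maps attributes to values.
table = {
    "t1": {"color": "red",  "size": "small"},
    "t2": {"color": "red",  "size": "large"},
    "t3": {"color": "blue", "size": "small"},
}

def common_pairs(tuples):
    """Attribute-value pairs shared by every tuple in the given set."""
    tuples = list(tuples)
    if not tuples:
        return {(a, v) for row in table.values() for a, v in row.items()}
    pairs = set(table[tuples[0]].items())
    for t in tuples[1:]:
        pairs &= set(table[t].items())
    return pairs

def matching_tuples(pairs):
    """Tuples whose rows contain all of the given attribute-value pairs."""
    return {t for t, row in table.items() if pairs <= set(row.items())}

def closure(tuples):
    """Closed subset of tuples: compose the two maps of the Galois connection."""
    return matching_tuples(common_pairs(tuples))

# {"t1"} is already closed; {"t1", "t2"} closes onto all tuples sharing color=red.
print(closure({"t1"}))        # {'t1'}
print(closure({"t1", "t2"}))  # {'t1', 't2'}
```

Under this reading, enumerating closed subsets rather than arbitrary tuple subsets is what gives the efficiency and interpretability gains the abstract refers to, since each closed subset is exactly the extent of some conjunction of attribute-value conditions.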
Keywords:
data mining; data exploration; data processing; machine learning; learning (artificial intelligence); statistical learning; pattern recognition; probability distribution; statistical distributions; generation mechanism; rough set theory; rough sets; granularity; granularity concept; variable granularity; Galois connection; information tables; extended DNF expression