Regional Information Center for Science and Technology (RICeST) - Data mining the PIMA dataset using rough set theory with a special emphasis on rule reduction

DocumentCode :

1646026

Title :

Data mining the PIMA dataset using rough set theory with a special emphasis on rule reduction

Author :

Khan, Aurangieb ; Revett, Kenneth

Author_Institution :

Dept. of CIS, Luton Univ., UK

fYear :

2004

Firstpage :

334

Lastpage :

339

Abstract :

This paper describes how rough set theory can be used as a tool for analyzing relatively complex decision tables such as the Pima Indian Diabetes Database (PIDD). We applied Rosetta, a public domain implementation of rough sets, to the PIDD to determine how to generate a predictive rule set with the highest accuracy and the fewest rules. A reduced rule set is advantageous because it focuses attention on the salient attributes and makes application in clinical practice more efficient (and more likely). We report the use of a genetic algorithm based rough set approach to the classification of diabetes, which achieved a success rate of 83% on the test set. This classification accuracy compares favorably with other reported results, which ranged from 65% to 75%. In addition, we achieved this accuracy with fewer than 100 rules. The high accuracy and low rule count support the use of rough sets as a data mining tool for biological databases.

Keywords :

biology computing; data mining; database management systems; genetic algorithms; rough set theory; Pima Indian Diabetes Database; biological databases; data mining; genetic algorithm; predictive rule set; rough set theory; rule reduction; Computational Intelligence Society; Data mining; Databases; Diseases; Genetics; Medical diagnostic imaging; Neural networks; Rough sets; Set theory; Testing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Multitopic Conference, 2004. Proceedings of INMIC 2004. 8th International

Print_ISBN :

0-7803-8680-9

Type :

conf

DOI :

10.1109/INMIC.2004.1492899

Filename :

1492899

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1646026