Title :
Data mining on PC cluster connected with storage area network: its preliminary experimental results
Author :
Oguchi, Masato ; Kitsuregawa, Masaru
Author_Institution :
Inst. of Ind. Sci., Univ. of Tokyo, Japan
Abstract :
Personal computer/workstation (PC/WS) clusters have become a hot research topic in the field of parallel and distributed computing. They are expected to play an important role as large-scale computer systems, such as large server sites and high-performance parallel computers, because of their good scalability and cost-performance ratio. From the application viewpoint, data-intensive applications such as data mining and ad hoc query processing in databases are considered very important for massively parallel processors, in addition to conventional scientific computation. Investigating the feasibility of such applications on a PC cluster is therefore meaningful. In this work, a PC cluster connected with a storage area network (SAN) is built and evaluated. For disk-to-disk copy operations, the SAN cluster performs much better than a LAN cluster. A data mining application is then implemented on the cluster. This application requires iterative scans of the shared disks, which degrades execution performance because of an I/O bottleneck. To resolve this problem, a dynamic data copy method is proposed and evaluated. The method prevents the performance degradation caused by the shared-disk bottleneck in SAN clusters.
Keywords :
data mining; local area networks; microcomputer applications; parallel architectures; parallel processing; query processing; workstation clusters; I/O-bottleneck; LAN clusters; PC cluster; ad-hoc query processing; cost performance ratio; data intensive applications; databases; disk-to-disk copy operation; distributed computing; dynamic data copy method; high performance parallel computers; large scale computer system; massively parallel processors; parallel computing; personal computer/workstation clusters; server sites; shared disk bottleneck; storage area network; Application software; Concurrent computing; Degradation; High performance computing; Large-scale systems; Microcomputers; Storage area networks; Workstations
Conference_Title :
Communications, 2001. ICC 2001. IEEE International Conference on
Conference_Location :
Helsinki
Print_ISBN :
0-7803-7097-1
DOI :
10.1109/ICC.2001.937036