Abstract:
Summary form only given. Today, storage technology is in strong demand for cloud computing, big data, and related applications, and the volume of data used in business and science has increased drastically. Under the Internet of Things paradigm, many sensor devices deployed around the world generate large amounts of data, which are aggregated into a few large-scale data centers; in this way, a data center holds a model of reality. Meanwhile, storage technology itself is developing rapidly: HDD capacity has grown in recent years, and SSDs and other fast flash-memory-based devices have become popular. These device technologies are used to build data centers for big data. Such large-scale storage systems use a large number of disks, and the cost of storage is usually proportional to that number. However, even if the reliability of a single disk is high, the reliability of the whole system decreases as disks are added, since with independent failures the system's mean time to failure is roughly inversely proportional to the number of disks. It is therefore important to improve storage reliability, and several reliability technologies are applied for this purpose: RAID, for example, is common on commodity servers, while clouds use P2P-based replication. Generally, replication performs better than RAID, but it drastically reduces the capacity efficiency of the storage. When designing storage, we must therefore consider not only performance but also reliability and capacity efficiency. In this talk, I will present the key points for designing effective storage. In my laboratory, we have developed the Virtual Large Scale Disks (VLSD) toolkit for constructing large-scale storage, and I will show several evaluations measured with VLSD used as a storage simulator.
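To make the reliability and capacity-efficiency trade-off concrete, the following is a small back-of-the-envelope sketch in Python. It is an illustration only, not part of the talk or the VLSD toolkit, and the per-disk reliability of 0.99 is an assumed value: with independent failures, the probability that all n unprotected disks survive a fixed period is r**n, while k-way replication keeps only 1/k of the raw capacity, against (n-1)/n for an n-disk RAID-5 group.

# Back-of-the-envelope sketch (an illustration, not from the talk or VLSD):
# with independent disk failures, whole-system reliability shrinks as disks
# are added, while replication and RAID differ sharply in capacity efficiency.

def system_reliability(r: float, n: int) -> float:
    """Probability that all n unprotected disks survive, each with reliability r."""
    return r ** n

def replication_efficiency(copies: int) -> float:
    """Usable fraction of raw capacity under k-way replication."""
    return 1.0 / copies

def raid_efficiency(n_disks: int, parity_disks: int) -> float:
    """Usable fraction of raw capacity for a RAID group (RAID-5 uses 1 parity disk)."""
    return (n_disks - parity_disks) / n_disks

if __name__ == "__main__":
    r = 0.99  # assumed per-disk reliability over some fixed period
    for n in (1, 10, 100, 1000):
        print(f"{n:4d} disks: system reliability = {system_reliability(r, n):.4f}")
    print(f"3-way replication efficiency: {replication_efficiency(3):.2f}")
    print(f"RAID-5 (10 disks) efficiency: {raid_efficiency(10, 1):.2f}")

Under these assumptions, the survival probability falls from 0.99 for one disk to roughly 0.37 for 100 disks, while 3-way replication retains only a third of the raw capacity against 90% for a 10-disk RAID-5 group, which is the trade-off the talk examines.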