Title :
Fault Tolerance via Diversity for Off-the-Shelf Products: A Study with SQL Database Servers
Author :
Gashi, Ilir ; Popov, Peter ; Strigini, Lorenzo
Author_Institution :
City Univ., London
Abstract :
If an off-the-shelf software product exhibits poor dependability due to design faults, then software fault tolerance is often the only way available to users and system integrators to alleviate the problem. Thanks to low acquisition costs, even using multiple versions of software in a parallel architecture, which is a scheme formerly reserved for few and highly critical applications, may become viable for many applications. We have studied the potential dependability gains from these solutions for off-the-shelf database servers. We based the study on the bug reports available for four off-the-shelf SQL servers plus later releases of two of them. We found that many of these faults cause systematic noncrash failures, which is a category ignored by most studies and standard implementations of fault tolerance for databases. Our observations suggest that diverse redundancy would be effective for tolerating design faults in this category of products. Only in very few cases would demands that triggered a bug in one server cause failures in another one, and there were no coincident failures in more than two of the servers. Use of different releases of the same product would also tolerate a significant fraction of the faults. We report our results and discuss their implications, the architectural options available for exploiting them, and the difficulties that they may present.
Keywords :
SQL; parallel processing; program debugging; relational databases; software architecture; software fault tolerance; SQL database server; off-the-shelf software product; parallel architecture; program debug; COTS software; Error processing; Fault tolerance; Reliability, availability, and serviceability; database availability; design diversity; experimental results; fault records; non-crash failures;
Journal_Title :
Dependable and Secure Computing, IEEE Transactions on
DOI :
10.1109/TDSC.2007.70208