Abstract:
In distributed systems, an application program is divided into several software modules, which need to
be allocated to processors connected by communication links. The distributed system reliability (DSR)
can be defined as the probability of successfully completing the distributed program. Previous studies
of optimal task allocation with respect to DSR focused on the effects of the inter-connectivity of processors,
the failure rates of the processors, and the failure rates of the communication links. We are the
first to study the effects of module software reliabilities and module execution frequencies on the optimal
task allocation. By viewing each module as a state in a Markov process, we build a task allocation decision
model to maximize DSR for distributed systems with a 100% reliable network. In this model, the DSR is
derived from the module software reliabilities, the processor hardware reliabilities, the transition probabilities
between modules, and the task allocation matrix. Resource constraints on each processor, namely limits on
memory space and computation load, are considered. A constraint on total system
cost, including the execution cost, the communication cost, and the failure cost, is also considered. We
solve the problem by Constraint Programming using the ILOG SOLVER library. We then apply the proposed
model to a case extended from previous studies. Finally, a sensitivity analysis is performed to verify
the effects of module software reliabilities and processor hardware reliabilities on the DSR and on the
task allocation decision.
Keywords:
Task allocation, Decision model, Distributed system reliability, Markov process, Constraint programming
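
To illustrate how a DSR value can be derived from the quantities named in the abstract, the following is a minimal sketch only, not the paper's actual formulation. It assumes an absorbing Markov chain over modules in which the last module is the terminal one, and it assumes that each visit to module i succeeds with probability equal to the product of its software reliability and the hardware reliability of the processor it is allocated to. The function names, the encoding of the allocation as a list of processor indices, and the numbers in the example are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def system_reliability(transition, module_rel, processor_rel, allocation):
    """Probability of completing the program, starting at module 0.

    transition[i][j]: probability that control passes from module i to j
                      (the row of the terminal module, the last one, is all zeros).
    module_rel[i]:    software reliability of module i per visit.
    processor_rel[k]: hardware reliability of processor k per visit.
    allocation[i]:    index of the processor hosting module i.
    Assumption: the network is perfectly reliable, so links do not appear.
    """
    n = len(module_rel)
    # Assumed per-visit success probability: software times hardware reliability.
    eff = [module_rel[i] * processor_rel[allocation[i]] for i in range(n)]
    # v[i] = probability of successful completion from module i, so
    # v = Q v + b with Q[i][j] = eff[i] * transition[i][j].
    a = [[(1.0 if i == j else 0.0) - eff[i] * transition[i][j]
          for j in range(n)] for i in range(n)]
    b = [0.0] * (n - 1) + [eff[n - 1]]
    return solve_linear(a, b)[0]


# Hypothetical three-module program: module 1 loops back to module 0 half the time.
P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.0, 0.0]]
software = [0.99, 0.98, 0.995]
hardware = [0.999, 0.990]

for alloc in ([0, 0, 0], [0, 1, 0], [1, 1, 1]):
    print(alloc, system_reliability(P, software, hardware, alloc))
```

Under these assumptions, frequently visited modules weigh more heavily on the result, which is why execution frequencies and module reliabilities can change which allocation is best. A full decision model would search over allocations subject to the memory, load, and cost constraints rather than enumerating a few by hand.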