Abstract:
Grid computing can exploit distributed resources, whether underutilized or not, to provide massively parallel CPU capacity. Load balancing, application sharing, and geographically dispersed databases are other Grid features of interest to a telecommunications operator (Telco). Building a Grid middleware to implement Telco services is thus a way to assess the validity of this type of architecture for future applications. To achieve a trustworthy platform, the middleware must take into account accidental or malicious faults, which can affect different aspects of resilience. This paper describes a secure and highly available architecture which, besides traditional Grid middleware functionalities (resource broker, job mapping, system monitoring, ...), uses fault-tolerance mechanisms (process duplication, failure handling, ...) to guarantee the QoS defined in the service level agreement. Security is addressed by analyzing each node's defense capability and finding a suitable way to match it with the appropriate user's job.