In this paper, we consider a model inspired by a scenario found in Grid-based scheduling systems, in which scheduling is performed remotely without access to up-to-date resource availability and usage information. We model this system as a collection of queues whose servers break down and are subsequently repaired. There is a delay before the scheduler learns of a failure; as a result, requests may continue to arrive at a resource queue for some time after active service has ceased. We consider the queues to be persistent under failure, but they have finite capacity, so a queue may become full, causing job loss. We use stochastic process algebra and stochastic probes to analyse this model and derive steady-state measures and passage-time distributions. The effect of the delay in information propagation on system response time and job loss is investigated and evaluated numerically.
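To make the kind of steady-state measure mentioned above concrete, the following is a minimal sketch of a single finite-capacity queue with breakdowns and delayed failure notification, written directly as a continuous-time Markov chain rather than in stochastic process algebra. It is not the model analysed in the paper: the exponentially distributed notification delay, the three server modes, the function names, and all rate values are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical server modes (illustrative assumption, not from the paper):
# UP: serving; DOWN_UNKNOWN: failed, scheduler not yet informed;
# DOWN_KNOWN: failed, scheduler informed and no longer sending jobs.
UP, DOWN_UNKNOWN, DOWN_KNOWN = range(3)


def build_generator(K, lam, mu, fail, repair, notify):
    """Build the CTMC generator for one queue of capacity K."""
    states = list(product(range(K + 1), (UP, DOWN_UNKNOWN, DOWN_KNOWN)))
    index = {s: i for i, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]

    def add(src, dst, rate):
        Q[index[src]][index[dst]] += rate
        Q[index[src]][index[src]] -= rate

    for n, mode in states:
        # Jobs keep arriving until the scheduler knows about the failure.
        if mode != DOWN_KNOWN and n < K:
            add((n, mode), (n + 1, mode), lam)
        if mode == UP:
            if n > 0:
                add((n, mode), (n - 1, mode), mu)
            add((n, mode), (n, DOWN_UNKNOWN), fail)
        else:
            # The queue is persistent: its contents survive the failure.
            add((n, mode), (n, UP), repair)
            if mode == DOWN_UNKNOWN:
                add((n, mode), (n, DOWN_KNOWN), notify)
    return states, Q


def steady_state(Q):
    """Solve pi Q = 0 with sum(pi) = 1 by Gaussian elimination."""
    n = len(Q)
    A = [[Q[j][i] for j in range(n)] for i in range(n)]  # transpose of Q
    A[-1] = [1.0] * n  # replace one balance equation by normalisation
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            if factor:
                for c in range(col, n):
                    A[r][c] -= factor * A[col][c]
                b[r] -= factor * b[col]
    pi = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][c] * pi[c] for c in range(i + 1, n))
        pi[i] = (b[i] - s) / A[i][i]
    return pi


def loss_probability(K, lam, mu, fail, repair, notify):
    """Fraction of jobs sent to this queue that find it full."""
    states, Q = build_generator(K, lam, mu, fail, repair, notify)
    pi = steady_state(Q)
    accepting = sum(p for (n, m), p in zip(states, pi) if m != DOWN_KNOWN)
    full = sum(p for (n, m), p in zip(states, pi)
               if m != DOWN_KNOWN and n == K)
    return full / accepting


if __name__ == "__main__":
    # Illustrative rates only; a larger notify rate means a shorter delay.
    for notify in (0.01, 1.0, 100.0):
        print(notify, loss_probability(5, 1.0, 2.0, 0.1, 0.5, notify))
```

In this sketch, jobs that the scheduler diverts once it knows of the failure are not counted as lost at this queue; only arrivals that find the queue full are. A passage-time analysis such as the response-time distribution would need additional machinery that is not shown here.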