Computability in Europe 2006
Logical Approaches to Computational Barriers

Regular Talk:
Risk Management in Grid Computing

Speaker: Karim Djemame
Author(s): Karim Djemame, Iain Gourlay and James Padgett
Slot: 17:20-17:40, col. 5


The Grid is an approach to high-performance and large-scale networked
computing that has the potential to dramatically alter how computing
resources are allocated for large projects. Built on the Internet and
the World Wide Web, it is a class of infrastructure comprising a set
of high-speed computers, storage systems and networks, plus a set of
Grid services (or middleware) to coordinate the ensemble of
resources. By providing scalable, secure, high-performance mechanisms
for discovering and negotiating access to remote resources, the Grid
promises to make it possible for scientific collaborations to share
resources on an unprecedented scale, and for geographically
distributed groups to work together in ways that were previously
impossible.

Grid technologies have reached a high level of development, but
adopters underline core shortcomings related to security,
trustworthiness, and dependability of the Grid for commercial
applications and services. Users require job execution with the
desired priority and quality, and negotiate Service Level Agreements
(SLAs) to define all aspects of the business relationship.
Nevertheless, providers remain cautious about adoption, because
agreeing to SLAs that include penalty fees is a business risk: for
example, a system failure can lead to an SLA violation. Providers
need risk assessment
methods as decision support for accepting/rejecting SLAs, for
price/penalty negotiation, for activating fault-tolerance actions,
and for capacity and service planning. Grid end-users need risk
estimates and aggregated confidence information for provider
selection and for fault-tolerance/penalty negotiations.

This research (recently funded by the European Commission) addresses
risk awareness and consideration in SLA negotiation, self-organising
fault-tolerant actions, and capacity planning. It will develop and
integrate methods for risk management in all Grid layers. The
cornerstones are risk management scenarios reflecting the perspectives of
Grid end-users, resource brokers, and resource providers. The results
will support all Grid actors by increasing transparency,
reliability, and trustworthiness, and by providing an objective
foundation for planning and management of Grid activities. Thus, this
research will supply Next Generation Grids with additional innovative
and required components to close the gap between SLAs as a concept
and SLAs as an accepted tool for commercial Grid uptake.

This research will produce generic and interoperable open-source
software for risk assessment, risk management and decision support in
each Grid layer. The quality of the outcome will be demonstrated in
provider environments and in close interaction with customers.

