is prohibited.
This simple strategy allows the system to work with unavailable nodes and it is
possible to take more than one node off-line simultaneously. The major drawback is
processing efficiency when unavailability of a few nodes occur: consider a NPDW
system with N homogeneous nodes. Although normally each node contains and
processes about 1/N of the data, if one node fails, the node replacing it with a replica
will have to process twice as much data (2/N), even though all the other nodes
will process only 1/N. The replica effort is placed on a single node, even though
other nodes are less loaded.
An alternative to full replicas is to use fully partitioned replicas (FPR)??”replicas are
partitioned into as many slices as there are nodes minus one. If there are N nodes, a
replica is partitioned into N-1 slices and each slice is placed in one node. The replica
of node i is now dispersed into all nodes except node i. In order to allow up to
R nodes to become unavailable, there must be R nonoverlapping replica slice sets.
Two replicas are nonoverlapped if the equivalent slices of the two replicas are not
placed in the same node. The following placement algorithm is used:
Number nodes linearly;
The copy of the data of node i is partitioned into N-1 numbered slices, starting at 1.
For j=0 to R:
For (slice x from 1 to N-1) Place slice x in node (i+j+ x) MOD N
This strategy is the most efficient because, considering N nodes, each replica slice
has 1/(N-1) of the data and each node has to process only that fraction in excess in
case of a single node becoming unavailable.
Pages:
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424