This is the "Replicated Join" (ReplicaJ). In a replicated join, the expression $R_1 \bowtie_A R_2$ is processed
as $(R_{11} \bowtie_A R_2) \cup \dots \cup (R_{1n} \bowtie_A R_2)$. LocalJ requires the datasets involved in the
join to be co-located. When co-locating partitions from multiple relations,
a partitioning issue arises: it is often necessary to choose which join
will be co-located. For example, consider the join $R_1 \bowtie_A R_2 \bowtie_B R_3$. In this case
R2 will either be partitioned on A, in which case it will be co-located with R1, or
on B, in which case it will be co-located with R3 (we can also partition R2 by both
attributes, but this does not result in co-location).
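
The co-location choice described above can be illustrated with a small sketch. It assumes a toy modulo hash on integer keys and two nodes; the helper names (`hash_partition`, `local_join`, `canon`) and the sample data are hypothetical and are not taken from the source. A join is treated as "local" when the union of the per-node joins equals the full join.

```python
def hash_partition(rows, key, n):
    """Split rows (dicts) into n fragments using a modulo hash of row[key]."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)
    return parts

def join(left, right, key):
    """Nested-loop equi-join on key; matching rows are merged into one dict."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

def local_join(left_parts, right_parts, key):
    """Union of per-node joins, with no data exchanged between nodes."""
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(join(lp, rp, key))
    return out

def canon(rows):
    """Order-independent representation of a result, for comparison."""
    return sorted(tuple(sorted(r.items())) for r in rows)

N = 2  # number of nodes (assumption)
R1 = [{"A": a, "x": a * 10} for a in range(4)]
R2 = [{"A": a, "B": (a + 1) % 4} for a in range(4)]
R3 = [{"B": b, "y": b * 100} for b in range(4)]

R1_on_A = hash_partition(R1, "A", N)
R3_on_B = hash_partition(R3, "B", N)
R2_on_A = hash_partition(R2, "A", N)  # option 1: co-locate R2 with R1
R2_on_B = hash_partition(R2, "B", N)  # option 2: co-locate R2 with R3

# Option 1: the join on A is complete locally, the join on B is not.
a_local_opt1 = canon(local_join(R1_on_A, R2_on_A, "A")) == canon(join(R1, R2, "A"))
b_local_opt1 = canon(local_join(R2_on_A, R3_on_B, "B")) == canon(join(R2, R3, "B"))

# Option 2: the situation is reversed.
a_local_opt2 = canon(local_join(R1_on_A, R2_on_B, "A")) == canon(join(R1, R2, "A"))
b_local_opt2 = canon(local_join(R2_on_B, R3_on_B, "B")) == canon(join(R2, R3, "B"))
```

With this data, whichever attribute R2 is partitioned on determines the one join that can be answered without moving data; the other join would need repartitioning or replication.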
In multidimensional schemas of data warehouses, the partitioning issue is raised as
some relations (e.g., facts) typically hold several foreign keys to other relations (e.g.,
dimensions). Furtado (2004c) searches for fact partitioning keys that increase the
share of joins processed as LocalJ rather than RpartJ, based on the query workload.
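
As a rough illustration of a workload-driven key choice, the sketch below picks the foreign key that makes the largest number of join executions local. This is not the method of Furtado (2004c); it is a deliberately simple frequency count that ignores relation sizes and transfer costs. The function name, attribute names, and frequencies are all hypothetical.

```python
from collections import Counter

def pick_partitioning_key(workload):
    """Return the fact-table join attribute used most often in the workload.

    workload: iterable of (join_attribute, frequency) pairs, one per
    fact-dimension join appearing in the queries. Partitioning the fact on
    the returned attribute lets those joins run as local joins.
    """
    local_count = Counter()
    for attr, freq in workload:
        local_count[attr] += freq
    return local_count.most_common(1)[0][0]

# Hypothetical workload: join attribute and how often queries use it.
workload = [
    ("customer_id", 40),
    ("product_id", 25),
    ("customer_id", 15),
    ("date_id", 30),
]
best_key = pick_partitioning_key(workload)
```

A more realistic cost model would weight each join by the amount of data that has to be repartitioned when the join is not local.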
If the interconnections are slow or the available bandwidth is small, a replication
strategy using ReplicaJ may be preferable, as it requires no or little data exchange
between nodes. Processing with replicas follows the logic of the "partition and
replicate strategy" (PRS) (Yu et al., 1989), where a single relation is partitioned
and the remaining ones replicated.
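
A minimal sketch of this idea, assuming one partitioned relation and one fully replicated relation: each node joins its own fragment with the complete copy of the other relation, and the union of the per-node results equals the full join. Because the second relation is present on every node, the fragments need not be partitioned on the join attribute; round-robin placement is used here for that reason. All names and data are hypothetical.

```python
def round_robin_partition(rows, n):
    """Assign rows to n fragments in round-robin order."""
    return [rows[i::n] for i in range(n)]

def join(left, right, key):
    """Nested-loop equi-join on key; matching rows are merged into one dict."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

def replicated_join(partitioned, replicated, key, n):
    """Join each fragment with the full replicated relation, then union."""
    out = []
    for fragment in round_robin_partition(partitioned, n):
        # Every node holds a full copy of `replicated`, so no data moves.
        out.extend(join(fragment, replicated, key))
    return out

def canon(rows):
    """Order-independent representation of a result, for comparison."""
    return sorted(tuple(sorted(r.items())) for r in rows)

R1 = [{"A": a % 3, "x": a} for a in range(6)]   # partitioned relation
R2 = [{"A": a, "z": a * a} for a in range(3)]   # replicated relation

same_result = canon(replicated_join(R1, R2, "A", 3)) == canon(join(R1, R2, "A"))
```

The trade-off is storage and update cost: every node stores and maintains a full copy of each replicated relation.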