
Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"


LC for local join:

$$\alpha \times \left( \frac{IR_i}{N} + \frac{R_j}{N} \right) \qquad (3)$$

RC for non-co-located datasets:

$$\beta \times \left( \frac{IR_i}{N} - \frac{IR_i}{N^2} \right) \qquad (4)$$
The value $IR_i/N$ in equation (4) is the fraction of $IR_i$ that resides at each node, and
$IR_i/N^2$ is the part of that fraction that already has the correct hash value for that
node and therefore requires no repartitioning.
Subtracting (3) from (2) gives the advantage of partitioning over replicating
when both datasets are co-located. If the datasets are not co-located, however, we
must also subtract equation (4) from this value. If β is large (small available bandwidth),
this RC cost can become dominant, and replication becomes the best choice.
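
As a rough illustration of this comparison, the Python sketch below evaluates equations (3) and (4) and the resulting advantage of partitioning. The function names and sample sizes are hypothetical. The symbols α and β are taken to be the local-processing and repartitioning cost weights, and the replicated-join cost stands in for equation (2), which is not reproduced in this section: its form α × (IR_i/N + R_j) is an assumption based on the remark that replication processes whole relations.

```python
def local_cost_partitioned(alpha, ir_i, r_j, n):
    """Equation (3): each node joins 1/N of both inputs."""
    return alpha * (ir_i / n + r_j / n)


def local_cost_replicated(alpha, ir_i, r_j, n):
    """ASSUMED form of equation (2): each node joins 1/N of IR_i
    with the whole replicated relation R_j."""
    return alpha * (ir_i / n + r_j)


def repartition_cost(beta, ir_i, n):
    """Equation (4): the share of IR_i at a node (IR_i/N) minus the part
    that already hashes to that node (IR_i/N^2)."""
    return beta * (ir_i / n - ir_i / n ** 2)


def partitioning_advantage(alpha, beta, ir_i, r_j, n, colocated):
    """(2) - (3), minus (4) when the datasets are not co-located.
    A positive value favours partitioning, a negative one replication."""
    advantage = (local_cost_replicated(alpha, ir_i, r_j, n)
                 - local_cost_partitioned(alpha, ir_i, r_j, n))
    if not colocated:
        advantage -= repartition_cost(beta, ir_i, n)
    return advantage
```

With α = 1, IR_i = 1000, R_j = 400 and N = 10, the co-located advantage is 360. For non-co-located data with β = 5 the repartitioning term is 450, so the advantage falls to -90 and replication is the cheaper option under these assumed numbers.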
The WBP strategy improves system performance by having each node process, as far as
possible, only 1/N of the relations and intermediate results (3), while also
reducing repartitioning requirements (4) by placing datasets according to the
workload. PRS, in contrast, focuses on eliminating repartitioning requirements
(4) to cope with low-bandwidth environments, at the price of having to
process whole relations (2). Finally, WBP-JB uses bitmaps over the nodes to avoid
the repartitioning cost (4), which also reduces local processing costs.
Given a cost model, a cost-based optimizer evaluates the cost of alternative execution
plans (including join orders) for alternative partitioning options (partition or
replicate relations).
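
A toy version of such an optimizer might look like the following Python sketch. It is an illustration under stated assumptions, not the chapter's algorithm: it enumerates left-deep join orders, prices each join as either "partition" or "replicate", and keeps the cheapest plan. The helper names, the `colocated_pairs` set, the single `selectivity` factor used to estimate intermediate result sizes, and the replicated-join cost (an assumed form of equation (2)) are all hypothetical.

```python
from itertools import permutations


def join_cost(alpha, beta, ir, r, n, option, colocated):
    """Cost of one join step (hypothetical helper)."""
    if option == "replicate":
        return alpha * (ir / n + r)            # ASSUMED form of equation (2)
    cost = alpha * (ir / n + r / n)            # equation (3)
    if not colocated:
        cost += beta * (ir / n - ir / n ** 2)  # equation (4)
    return cost


def best_plan(sizes, colocated_pairs, alpha, beta, n, selectivity):
    """Return (total_cost, join_order, per-join options) of the cheapest plan.

    sizes: relation name -> size; colocated_pairs: set of frozensets naming
    relations that are already partitioned on their common join key.
    """
    best = None
    for order in permutations(sizes):
        ir, prev = sizes[order[0]], order[0]
        total, steps = 0.0, []
        for name in order[1:]:
            colocated = frozenset((prev, name)) in colocated_pairs
            # Price both partitioning options for this join, keep the cheaper.
            option, cost = min(
                ((opt, join_cost(alpha, beta, ir, sizes[name], n, opt, colocated))
                 for opt in ("partition", "replicate")),
                key=lambda pair: pair[1],
            )
            total += cost
            steps.append((name, option))
            # Crude size estimate for the next intermediate result (assumption).
            ir = selectivity * ir * sizes[name]
            prev = name
        if best is None or total < best[0]:
            best = (total, order, steps)
    return best
```

For two relations of sizes 1000 and 400 on N = 10 nodes with α = 1, the co-located case yields a partitioned plan of cost 140. Without co-location and with β = 5, the sketch prefers to start from the smaller relation, so that it is the one repartitioned, at a total cost of 320.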

