Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"


With PRS, the join execution plan of Figure 5(b) would be executed without
any data exchange between nodes, but each node would need to process the full
O and PS relations, which are 18 GB and 7.5 GB in size, respectively, for TPC-H
at 100 GB (scale factor 100).
In order to avoid replicating very large relations, a modified strategy is to replicate
dimensions and partition every fact, while also co-locating LI and O:
• Hash-partition fact and replicate dimensions strategy (PFRD-H): Partition
relations identified as facts by the user (LI, O, and PS in TPC-H), co-locating
LI and O. With PFRD-H, the execution plan of Figure 4b requires repartitioning
of only two datasets: the intermediate result LI-O-P-S and relation PS. The
join between LI and O is a LocalJ.
• Workload-based partitioning (WBP): A workload-based strategy where
hash-partitioning attributes are determined based on schema and workload
characteristics. We use the strategy proposed in Furtado (2004c). The partitioning
algorithm is:
1. Dimensions: Small dimensions are replicated into every node (and optionally
cached into memory). Non-small dimensions can simply be hash-partitioned
by their primary key. This is because that attribute is expected
to be used in every equi-join with facts, as the references from facts to
dimensions correspond to foreign keys.
The determination of whether a dimension is small can be cost-based or,
for simplicity, based on a user-defined threshold (e.
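
The dimension rule in step 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of Furtado (2004c): the names `Dimension`, `place_dimension`, and `node_for_key`, the size-in-megabytes threshold, and the use of CRC32 as the hash function are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from zlib import crc32


@dataclass
class Dimension:
    name: str
    primary_key: str
    size_mb: float  # illustrative size estimate, not a measured value


def place_dimension(dim: Dimension, small_threshold_mb: float) -> Tuple[str, Optional[str]]:
    """Decide the placement of one dimension.

    Small dimensions are replicated to every node; all others are
    hash-partitioned by primary key. The threshold is user-defined.
    """
    if dim.size_mb <= small_threshold_mb:
        return ("replicate", None)
    return ("hash", dim.primary_key)


def node_for_key(key: object, num_nodes: int) -> int:
    """Map a key value to a node with a deterministic hash (CRC32 is an assumption)."""
    return crc32(str(key).encode("utf-8")) % num_nodes


if __name__ == "__main__":
    # Hypothetical dimensions and threshold, chosen only for illustration.
    dims = [
        Dimension("small_dim", "small_id", size_mb=1.0),
        Dimension("large_dim", "large_id", size_mb=5000.0),
    ]
    for d in dims:
        print(d.name, place_dimension(d, small_threshold_mb=100.0))
```

Because `node_for_key` depends only on the key value, a fact row whose foreign key is 42 and the dimension row whose primary key is 42 are assigned to the same node, which is why an equi-join on that attribute needs no data exchange.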

