
Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"

On the other hand, very small relations
can be replicated to avoid the need to repartition other very large datasets that may
need to be joined with them. In practice the decision on replication vs. partitioning
for each relation can be taken by a cost-based optimizer that evaluates alternative
execution plans and partitioning scenarios to determine the best one. Horizontally partitioned
relations can typically be divided using a round-robin, random, range,
or hash-based scheme. We assume horizontal hash-partitioning, as this approach
facilitates key-based tuple location for parallel operation. Partitioning is closely
tied to query processing, so we first describe generic query processing
over the NPDW and then focus on parallel join and partitioning alternatives.
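
The hash-partitioning assumption above can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the use of CRC32 as the hash, the column names, and the sample rows are all assumptions made for illustration. A large relation is split by the hash of a key, so the node holding any tuple can be computed from the key alone, while a very small relation is simply copied to every node.

```python
import zlib

def node_for_key(key, num_nodes):
    """Map a key to a node index with a deterministic hash.

    CRC32 is an illustrative choice; any stable hash function would do.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

def hash_partition(rows, key_column, num_nodes):
    """Horizontally partition rows so equal keys land on the same node."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[node_for_key(row[key_column], num_nodes)].append(row)
    return partitions

def replicate(rows, num_nodes):
    """Give every node its own full copy of a very small relation."""
    return [list(rows) for _ in range(num_nodes)]

# Hypothetical data: a larger fact-like relation and a tiny lookup relation.
sales = [{"customer_id": i % 7, "amount": 10 * i} for i in range(20)]
regions = [{"region_id": 1, "name": "North"}, {"region_id": 2, "name": "South"}]

sales_parts = hash_partition(sales, "customer_id", 4)
region_copies = replicate(regions, 4)
```

Because the node index depends only on the key, two relations partitioned on the same join key with the same function keep matching tuples on the same node, which is the property that makes key-based tuple location useful for parallel operation.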
Generic Processing over the NPDW
Query processing over a parallel shared-nothing database, and in particular over the
NPDW, follows roughly the steps in Figure 2(b). Figure 2(a) illustrates a simple sum
query example over the NPDW. In this example the task is divided among all nodes,
so that each node applies exactly the same initial query to its partial data,
and the results are merged by applying a merge query at the merging node
to the partial results coming from the processing nodes. If the datasets could be
Efficient and Robust Node-Partitioned Data Warehouses 2
Copyright © 2007, Idea Group Inc.
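
The sum query described on this page can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Figure 2: the function names, the `amount` column, and the per-node data are hypothetical. Each processing node runs the same initial query over its own partition, and the merging node applies a merge query to the partial results.

```python
def local_sum(partition, column):
    """Initial query, run identically on every node over its partial data."""
    return sum(row[column] for row in partition)

def merge_sums(partial_results):
    """Merge query, run at the merging node over the partial results."""
    return sum(partial_results)

# Hypothetical partitions held by four processing nodes (one is empty).
node_data = [
    [{"amount": 10}, {"amount": 20}],
    [{"amount": 5}],
    [],
    [{"amount": 7}, {"amount": 8}],
]

# Each node computes its partial sum independently of the others.
partials = [local_sum(partition, "amount") for partition in node_data]

# The merging node combines the partial sums into the final answer.
total = merge_sums(partials)
```

For a sum, the merge query is itself a sum. Other aggregates would need a different pairing: an average, for instance, would typically have each node return a partial sum and a partial count for the merging node to combine.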

