SEARCH
0-9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Prev | Current Page 394 | Next

Robert Wrembel and Christian Koncilia

"Data Warehouses and Olap: Concepts, Architectures and Solutions"

The most common primitives are: Linear sum (LS=SUM(X));
Sum of squares (SS=SUM(X2)); Number of elements (N); and Extremes (MAX
and MIN).
Although in Figure 3 every node computes partial aggregations for all aggregation
groups, aggregation can also be computed by assigning the computation of specific
aggregation groups to specific nodes (Shatdal & Naughton, 1995). A detailed study
and evaluation of query processing issues in the NPDW is available in Furtado
(2005a).
The repartitioning operation of step 3R in Figure 2(b) is necessary whenever a partitioned
dataset needs to participate in a join but is not partitioned by the join attribute.
Each node is assigned a hash range for the join key, and every node needs to send to
Figure 2. Query processing steps in NPDW: (a) example query (b) query processing
steps
(a) (b)
2 2 Furtado
Copyright ?© 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of
Idea Group Inc. is prohibited.
every other node the tuples it has that belong to the hash-range corresponding to that
node. It should be implemented as efficiently as possible to minimize the cost. We
assume a switched network (the cost of repartitioning would be larger on a sharedmedia
hub-based network). A simple parallel repartitioning algorithm would be:
Number the N nodes sequentially;
For (i=1;iParallel: every node j sends data to node (j+i) mod N;
The objective of this algorithm is for nodes to exchange data in parallel, to reduce
the repartitioning overhead.


Pages:
382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406