
Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"

In practice, nodes and the processing at each node are not homogeneous, and
both the switch and the nodes' network interfaces often limit full-duplex
capability and performance, so the data communication overhead is usually
larger than in the optimal case. Given the generic query processing
architecture, we focus next on partitioning alternatives.
Partitioning vs. Replication in NPDW
Consider relations or, more generically, datasets R1 and R2 that must be joined by
an equi-join key as part of the execution plan: $R_1 \bowtie_A R_2$. Consider also that R1 is
fully horizontally partitioned across all nodes or across a node group. Each of the N nodes
should process only 1/N of the total work in order to take full advantage of parallel
execution. If both relations are partitioned by the same equi-join key, the join can be
processed as a "Local or Co-located Join" (LocalJ) and this is the fastest alterna-
Figure 3. Basic aggregation query steps
0. Query submission:
Select sum(a), count(a), average(a), max(a), min(a),
stddev(a), group_attributes
From fact, dimensions (join)
Group by group_attributes;
3. Nodes compute partial results:
Select sum(a), count(a), sum(a x a), max(a), min(a),
group_attributes
From fact, dimensions (join)
Group by group_attributes;
4. Results collecting:
Create cached table
PRqueryX(node, suma, counta, ssuma, maxa, mina,
group_attributes)
as ;
5.
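
The partial-result columns listed in Figure 3 (suma, counta, ssuma, maxa, mina) carry enough information to rebuild every aggregate of the submitted query once the per-node rows have been collected. A minimal Python sketch of that merge follows, assuming one partial row per node for a single group and non-empty partitions. The names Partial, node_partial and merge are hypothetical and do not come from the book, and the choice of population (rather than sample) standard deviation is an assumption, since the figure does not say which one is meant.

```python
import math
from dataclasses import dataclass


@dataclass
class Partial:
    """One node's partial result for one group (hypothetical record layout)."""
    suma: float    # sum(a)
    counta: int    # count(a)
    ssuma: float   # sum(a x a)
    maxa: float    # max(a)
    mina: float    # min(a)


def node_partial(values):
    """Compute what a single node would return for its own partition.

    Assumes the partition holds at least one value.
    """
    return Partial(
        suma=sum(values),
        counta=len(values),
        ssuma=sum(v * v for v in values),
        maxa=max(values),
        mina=min(values),
    )


def merge(partials):
    """Combine per-node partial results into the final aggregates."""
    total = sum(p.suma for p in partials)
    n = sum(p.counta for p in partials)
    ssum = sum(p.ssuma for p in partials)
    mean = total / n
    # Population variance from the collected sums: E[a^2] - (E[a])^2.
    # Clamp at zero to absorb small negative values from rounding.
    variance = max(ssum / n - mean * mean, 0.0)
    return {
        "sum": total,
        "count": n,
        "average": mean,
        "max": max(p.maxa for p in partials),
        "min": min(p.mina for p in partials),
        "stddev": math.sqrt(variance),
    }


if __name__ == "__main__":
    # Three nodes, each holding a slice of the same group.
    parts = [node_partial(p) for p in ([1, 2, 3], [4, 5], [6])]
    print(merge(parts))
```

This shows why the nodes return sum(a x a) instead of stddev(a): averages and standard deviations of partitions cannot be combined directly, whereas sums, counts, maxima and minima can.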

