
Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"

Operation tasks are submitted
as required, and resource utilization for disk access, memory and bus, processor, and
network send/receive is used to determine the completion time of those tasks. For
instance, the cost of a hybrid hash-join is related to the cost of scanning the relations
from secondary storage, bucketizing them, building a hash table, and probing
into the hash table. Specifically, the cost to join relations R1 and R2, expressed in
terms of the individual scan costs, is scanR1 + scanR2 + 2 (scanR1 + scanR2)(1 - q), where q
denotes the fraction of R1 whose hash table fits in memory (Steinbrunn et al., 1997).
Disk access rates (measured in MB/sec) are then used to complete the evaluation
of the cost. Similar strategies are applied to evaluate the repartitioning cost, which
involves scanning the datasets, operating on them, assigning buffers, and sending to
destination nodes (with a given network bandwidth in MB/sec). Typical numbers of
instructions used to process different low-level operations and to send and receive
messages (Network) were included as parameters to the simulator (Stöhr, Märtens
& Rahm, 2000). For these experiments we used a 100 GB TPC-H data set and the generic
query Qa of Figure 5a, with default selectivities for attribute values (x, y, w, z) of
(0.7, 0.7, 0.2, 0.2), respectively.
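
The hybrid hash-join cost expression quoted above from Steinbrunn et al. (1997) can be illustrated with a minimal sketch. The function names, the relation sizes, and the disk rate below are hypothetical values chosen for illustration. They are not parameters of the simulator described in the text. The sketch also assumes that a scan cost is simply relation size divided by the disk access rate.

```python
def scan_time(size_mb: float, disk_rate_mb_per_s: float) -> float:
    """Seconds needed to scan a relation of size_mb at the given disk rate.

    Assumption: scan cost = size / disk access rate (no seek or CPU terms).
    """
    if disk_rate_mb_per_s <= 0:
        raise ValueError("disk rate must be positive")
    return size_mb / disk_rate_mb_per_s


def hybrid_hash_join_cost(scan_r1: float, scan_r2: float, q: float) -> float:
    """Cost of joining R1 and R2 from their individual scan costs.

    Implements scanR1 + scanR2 + 2 (scanR1 + scanR2)(1 - q), where q is the
    fraction of R1 whose hash table fits in memory.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    return scan_r1 + scan_r2 + 2.0 * (scan_r1 + scan_r2) * (1.0 - q)


# Hypothetical example: R1 = 1000 MB, R2 = 4000 MB, disk rate = 50 MB/sec,
# and half of R1's hash table fits in memory (q = 0.5).
r1 = scan_time(1000.0, 50.0)   # 20.0 seconds
r2 = scan_time(4000.0, 50.0)   # 80.0 seconds
print(hybrid_hash_join_cost(r1, r2, 0.5))  # 200.0
```

With q = 1 the whole hash table fits in memory and the cost reduces to the two scans. With q = 0 every partition is written out and read back, so the cost is three times the combined scan cost.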
Figure 9 shows the response time (a) and speedup (b) vs.

