Conclusion
We have discussed design issues for low-cost alternatives to specialized, fast, and
fully dedicated parallel hardware to handle large data warehouses. The idea is to
design the system with special care for partitioning, covering both data placement and
reorganization, and for availability. Alternative partitioning strategies
were proposed and their performance compared. We have tested replica-based and
partitioning-based strategies and analyzed their performance as a function of the number
of nodes and the available network bandwidth. We also tested the use of early selection with
join bitmaps as an approach to overcome extra overheads related to repartitioning
and overall processing. We concluded that workload-based partitioning is a suitable
strategy, and that join bitmaps not only improve speedup but also prevent significant
slowdown when the available network bandwidth is low. We have also described
replication-based availability, which allows always-on behavior and maintains efficiency
when multiple nodes are taken off-line.
226 Furtado
Copyright © 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of
Idea Group Inc. is prohibited.
Acknowledgments
This work was supported in part by the Portuguese "Fundação para a Ciência e
Tecnologia," under project POSC/EIA/57974/2004.