Bitmap join indexes (O'Neil & Graefe, 1995) are very efficient materialized structures
for avoiding costly joins. When bitmap join indexes are applied to a data
warehouse schema, each bitmap indicates which fact rows correspond to each attribute
value of a dimension table, representing the precomputed result of a join
between the fact and a dimension table. Consider the simple example of a "Sales"
fact, a "Product" dimension, and a "Brand" attribute within "Product." A bitmap
for Brand "X" associates a bit with each row of the Sales fact with a "1" bit if that
row is a sale of Brand "X" and a "0" bit otherwise. A query for sales of Brand "X"
may scan the bitmap and then read only the rows of Sales corresponding to that brand.
More importantly, it also avoids the need to join Sales with Product, and therefore
the need to repartition Product if it is partitioned and not co-located with Sales. In
summary, early selection, and bitmap join indexes in particular, significantly reduces
the amount of data that must be exchanged between nodes, as long as the query
patterns include selective conditions.
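The Sales/Product/Brand example above can be sketched in a few lines of Python. This is only an illustrative, in-memory sketch, not how a database engine stores such an index: the toy data and the names `build_bitmap_join_index`, `matching_rows`, and `total_sales_for_brand` are assumptions introduced here, and each bitmap is held as a plain Python integer whose bit i corresponds to row i of Sales.

```python
from collections import defaultdict

# Hypothetical toy data (not from the text).
# Product dimension: product_id -> brand
product = {1: "X", 2: "Y", 3: "X"}
# Sales fact rows: (product_id, amount); the list position is the row id
sales = [(1, 10.0), (2, 5.0), (3, 7.5), (2, 1.0), (1, 2.5)]


def build_bitmap_join_index(fact_rows, dimension):
    """Precompute the join once: return {brand: bitmap}, where bit i of the
    bitmap is 1 if fact row i joins to a product of that brand."""
    index = defaultdict(int)
    for row_id, (product_id, _amount) in enumerate(fact_rows):
        brand = dimension[product_id]  # the join is resolved here, at build time
        index[brand] |= 1 << row_id
    return dict(index)


def matching_rows(bitmap):
    """Yield the row ids whose bit is set in the bitmap, in ascending order."""
    row_id = 0
    while bitmap:
        if bitmap & 1:
            yield row_id
        bitmap >>= 1
        row_id += 1


def total_sales_for_brand(brand, fact_rows, index):
    """Answer the query from the bitmap and the fact rows only;
    the Product dimension is never touched at query time."""
    bitmap = index.get(brand, 0)
    return sum(fact_rows[i][1] for i in matching_rows(bitmap))


index = build_bitmap_join_index(sales, product)
total_x = total_sales_for_brand("X", sales, index)
```

In this sketch the query for Brand "X" reads only the fact rows selected by the bitmap and never consults `product`, which mirrors why no join with, or repartitioning of, the dimension is needed at query time.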
Next we review replication for availability, which is also a major concern in
the low-reliability environment of the NPDW.
Replication for Availability
A discussion of availability for node-partitioned data warehouses brings up several
issues, such as network failures, data loading failures, and availability monitoring.