We have just discussed how different
encoding methods could reduce the index size and improve query response time.
Next, we describe a strategy called binning to reduce the number of bitmaps. Since
the encoding methods described before only take certain integer values as input,
we may also view binning as a way to produce these integer values (bin numbers)
for the encoding strategies.
The basic idea of binning is to build a bitmap for a bin rather than each distinct attribute
value. This strategy disassociates the number of bitmaps from the attribute
cardinality and allows one to build a bitmap index of a prescribed size, no matter
how large the attribute cardinality is. A clear advantage of this approach is that it
allows one to control the index size. However, it also introduces some uncertainty
in the answers if one only uses the index. To generate precise answers, one may
need to examine the original data records (candidates) to verify that the userspecified
conditions are satisfied. The process of reading the base data to verify the
query conditions is called candidate check (Rotem et al., 2005b; Stockinger, Wu,
& Shoshani, 2004).
A small example of an equality-encoded bitmap index with binning is given in
Figure 4. In this example we assume that an attribute A has values between 0 and
100. The values of the attribute A are given in the second leftmost column.
Pages:
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326