The
range of possible values of A is partitioned into five bins [0, 20), [20, 40).... A ???1-
bit??? indicates that the attribute value falls into a specific bin. On the contrary, a
???0-bit??? indicates that the attribute value does not fall into the specific bin. Take the
example of evaluating the query ???Count the number of rows where 37 <= A < 63.???
Bitmap Indices for Data Warehouses 6
Copyright ?© 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission
of Idea Group Inc. is prohibited.
The correct result should be 2 (rows 5 and 7). We see that the range in the query
overlaps with bins 1, 2, and 3. We know for sure that all rows that fall into bin 2
definitely qualify (i.e., they are hits). On the other hand, rows that fall into bins 1
and 3 possibly qualify and need further verification. In this case, we call bins 1 and
3 edge bins. The rows (records) that fall into edge bins are candidates and need to
be checked against the query constraint.
In our example, there are four candidates, namely rows 1 and 3 from bin 1, and rows
5 and 6 from bin 3. The candidate check process needs to read these four rows from
disk and examine their values to see whether or not they satisfy the user-specified
conditions. On a large dataset, a candidate check may need to read many pages and
may dominate the overall query response time (Rotem et al.
Pages:
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327