SEARCH
0-9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Prev | Current Page 322 | Next

Robert Wrembel and Christian Koncilia

"Data Warehouses and Olap: Concepts, Architectures and Solutions"

Chan and Ioannidis (1999) showed that range encoding is the best bitmap
encoding technique for range-queries. However, range encoding might not always
be practical for high-cardinality attributes or for a large number of bins. As we will
show in the next section, range-encoded bitmap indices do not compress as well as
equality-encoded bitmap indices.
The general rule for choosing the number of bins is as follows: The more bins, the
less work during the candidate check. The reason is fairly straightforward. In general,
as the number of bins increases, the number of candidates per bin decreases.
Let us consider the following example. Assume the base data follows a uniform
random distribution. With a typical page size of 8KB, using the projection index,
Bitmap Indices for Data Warehouses 6
Copyright ?© 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission
of Idea Group Inc. is prohibited.
a page could hold 2048 4-byte words. If one in 1000 words is accessed during the
candidate check, it is likely that every page containing the attribute values would
be touched (Stockinger et al., 2004). We, thus, suggest using 1000 bins or more.
For equality encoding there is an additional trade-off, namely using more bins may
also increase the cost of the index scan. For range encoding the cost of the index scan
is not significantly affected by the number of bins because one needs to access no
more than two bitmaps to evaluate a range query (Chan & Ioannidis, 1999).


Pages:
310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334