SEARCH
0-9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Prev | Current Page 326 | Next

Robert Wrembel and Christian Koncilia

"Data Warehouses and Olap: Concepts, Architectures and Solutions"

With binning, one may
Figure 6. Size of base data compared with bitmap indices
Note: EE = equality encoding; RE = range encoding; lit = literal (no compression); comp = with compression
.E+0
.E+08
.E+0
.E+ 0
base data EE- 000-lit EE- 000-
comp
RE- 00-lit RE- 00-
comp
Base data and indices for combustion data set
Size [bytes]
Bitmap Indices for Data Warehouses
Copyright ?© 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission
of Idea Group Inc. is prohibited.
use many thousands of bins and the maximum index size would still be no more
than 2N. Since a number of commercial implementations of B-trees are observed to
use 3N to 4N words, the maximum size of compressed bitmap indices is relatively
modest. As we will show for real application data, the WAH compressed index is
often much smaller than the predicted worst-case sizes.
For WAH compression, in the worst case, about 90% of the bitmaps in a rangeencoded
bitmap index will not be compressible (Wu et al., 2006). Unless one can
tolerate very large indices or one knows beforehand that compression would be
effective, we generally recommend using no more than 100 bins for range-encode
bitmap indices. This guarantees that the size of the bitmap index is at worst the
size of a B-tree.
Index Size for Real Application Datasets
We will now analyze experimentally the size of compressed bitmap indices for
various application datasets.


Pages:
314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338