Combustion Dataset
The combustion dataset comes from a simulation of the auto-ignition of a turbulent
hydrogen-air mixture from the TeraScale High-Fidelity Simulation of Turbulent
Combustion with Detailed Chemistry (Tera Scale Combustion, 2005). The dataset
consists of 24 million records with 16 attributes each. For this dataset we built
equality-encoded and range-encoded bitmap indices with various numbers of equidepth
bins. Figure 6 shows the average size of the compressed bitmap indices per
attribute. The equality-encoded bitmap indices with 1000 bins and the
range-encoded bitmap indices with 100 bins are each about the same size as the base
data. By comparison, an uncompressed bitmap index with 100 bins is about 3
times as large as the base data, and one with 1000 bins is about 30 times as
large. The WAH compression algorithm therefore works well on this dataset.
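
The uncompressed sizes quoted above can be checked with simple arithmetic, and the two encodings can be illustrated on a toy column. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, bitmaps are stored as plain Python lists, and the base data is assumed to use 32 bits per attribute value (an assumption that is consistent with the "about 3 times" figure but is not stated in the text). An uncompressed equality-encoded index stores one bit per record per bin, so its size relative to the base data is roughly the number of bins divided by the bits per value.

```python
from bisect import bisect_left


def equidepth_edges(values, n_bins):
    """Upper boundary of each bin, chosen so that the bins hold about
    equally many records (assumes len(values) >= n_bins)."""
    ordered = sorted(values)
    n = len(ordered)
    # The k-th boundary is the largest value among the first k/n_bins records.
    return [ordered[(k * n) // n_bins - 1] for k in range(1, n_bins + 1)]


def equality_bitmaps(values, edges):
    """Equality encoding: bitmap b has bit i set if record i falls in bin b."""
    bitmaps = [[0] * len(values) for _ in edges]
    for row, v in enumerate(values):
        # bisect_left finds the first bin whose upper boundary is >= v.
        bitmaps[bisect_left(edges, v)][row] = 1
    return bitmaps


def range_bitmaps(eq):
    """Range encoding: bitmap b has bit i set if record i falls in bin b
    or any lower bin. The last cumulative bitmap would be all ones, so it
    is dropped here."""
    out = []
    running = [0] * len(eq[0])
    for bitmap in eq[:-1]:
        running = [a | b for a, b in zip(running, bitmap)]
        out.append(running)
    return out


def uncompressed_ratio(n_bins, bits_per_value=32):
    """Size of an uncompressed equality-encoded index relative to the base
    data, assuming bits_per_value bits per attribute value (hypothetical
    default of 32)."""
    return n_bins / bits_per_value


if __name__ == "__main__":
    column = [5.0, 1.0, 7.0, 3.0, 8.0, 2.0, 6.0, 4.0]
    edges = equidepth_edges(column, 4)
    eq = equality_bitmaps(column, edges)
    rng = range_bitmaps(eq)
    print("bin boundaries:", edges)
    print("equality bitmaps:", eq)
    print("range bitmaps:", rng)
    print("100 bins:", uncompressed_ratio(100))
    print("1000 bins:", uncompressed_ratio(1000))
```

Under the 32-bit assumption the ratio is 100/32, or about 3.1, for 100 bins and 1000/32, or about 31, for 1000 bins, which agrees with the "about 3 times" and "about 30 times" figures. The compressed sizes shown in Figure 6 depend on the data and cannot be derived this way.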
High-Energy Physics Dataset
Our second dataset is from a high-energy physics experiment at the Stanford Linear
Accelerator Center. It consists of 7.6 million records with 10 attributes. Figure
7 shows the size of the compressed bitmap indices. The range-encoded bitmap index
with 100 bins is about twice as large as the base data.
The equality-encoded bitmap index with 1000 bins is about 30% smaller than the
2 Stockinger & Wu
Copyright © 2007, Idea Group Inc.