The remaining twelve
attributes are unordered. Of these, we exclude K40k, K250k, and K500k from
our selected queries to limit the number of queries. The attributes
included for each query set provide a sufficient variety of cardinality
and selectivity values for that set while keeping the number of experiments
manageable. In our simulations, we assume a uniform data distribution, which
is consistent with the BENCH table. We identify a subset of the set query
benchmark that consists of document search and direct marketing queries. We
omit the management reporting queries because our study does not consider
aggregation or join queries. The performance study here focuses
on selection costs, so we simulate the selection conditions (the SQL WHERE
clause) of each chosen query.
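
To make the role of the uniform-distribution assumption concrete, the following minimal sketch estimates the selectivity of a selection condition from attribute cardinalities. It is an illustration only, not the simulator used in this study. It assumes that an attribute named Kn takes n distinct, equally likely values, that predicates on different attributes are independent, and that the table holds one million rows. The function names and the attribute subset shown are hypothetical.

```python
from fractions import Fraction
from math import prod

# Assumed cardinalities: attribute Kn is taken to have n distinct values.
# Only a few attributes are listed here for illustration.
CARDINALITY = {"K2": 2, "K4": 4, "K10": 10, "K100": 100, "K10k": 10_000}


def equality_selectivity(attribute):
    """Fraction of rows matching `attribute = constant` under uniform data."""
    return Fraction(1, CARDINALITY[attribute])


def conjunction_selectivity(attributes):
    """Selectivity of an AND of equality predicates.

    Assumes the attributes are statistically independent, so the
    individual selectivities multiply.
    """
    return prod((equality_selectivity(a) for a in attributes), start=Fraction(1))


def expected_rows(attributes, table_rows):
    """Expected number of qualifying rows for the conjunction."""
    return table_rows * conjunction_selectivity(attributes)


# Example: WHERE K2 = c1 AND K100 = c2 on an assumed 1,000,000-row table.
rows = expected_rows(["K2", "K100"], 1_000_000)
```

Under these assumptions, a condition on a low-cardinality attribute alone qualifies a large fraction of the table, whereas adding a high-cardinality attribute to the conjunction sharply reduces the expected result size.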
We consider 43 queries in total, and from these we create
subsets based on criteria such as attribute cardinality and query
selectivity. The six query sets comprise queries with high-cardinality
attributes, very-high-cardinality attributes, low-cardinality attributes, low
selectivity, high selectivity, and mixed queries. For each of these query sets, we
vary selected input parameters to study their impact and fix the other factors to limit
the number of simulations. The parameter values are given in Table 7.