When M is not limited to a count-based
measure, we define a frequent (closed) itemset Y extracted from C as meaningful
if the following conditions hold: (1) Y ∩ D ≠ ∅, and (2) Y ∩ M1 ≠ ∅, where M1 ⊂ M is
the set of non-count based measures. The two conditions impose the presence of at
least one dimension and one non-count measure (e.g., MIN).
From Table 1(b), one can extract the frequent (support = 41%) closed itemset {Duality
= 0, AvgAsset=2, Govern = 2} which is meaningful since the second item represents
a range value of the measure related to the average company asset. In contrast,
{Internal = 3, Govern = 2} is not meaningful because it contains no non-count measure item.
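
The two conditions can be checked mechanically. The following is a minimal sketch, not the authors' implementation: the function name is_meaningful is hypothetical, and the assignment of Duality, Govern, and Internal to the dimension set D and of AvgAsset to the non-count measure set M1 is an assumption inferred from the example above.

```python
def is_meaningful(itemset, dimensions, non_count_measures):
    """Check conditions (1) and (2): the itemset must contain at least
    one dimension item and at least one non-count measure item."""
    attributes = set(itemset)  # itemset maps attribute name -> value
    has_dimension = bool(attributes & dimensions)        # Y ∩ D ≠ ∅
    has_measure = bool(attributes & non_count_measures)  # Y ∩ M1 ≠ ∅
    return has_dimension and has_measure


# Assumed roles of the attributes (not stated explicitly in the text).
dimensions = {"Duality", "Govern", "Internal"}
non_count_measures = {"AvgAsset"}

y1 = {"Duality": 0, "AvgAsset": 2, "Govern": 2}
y2 = {"Internal": 3, "Govern": 2}

print(is_meaningful(y1, dimensions, non_count_measures))  # True
print(is_meaningful(y2, dimensions, non_count_measures))  # False
```

Under these assumptions the first itemset satisfies both conditions, while the second fails condition (2).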
Association Rules
Based on the observations made earlier in this section, we define two types of association
rules: the first is computed from a data cube whose only measure
represents a COUNT aggregate function, and the second is computed from a data cube
in which at least one measure represents an aggregate function other than COUNT
(e.g., MIN, AVERAGE). Rules of the second type must be generated from meaningful frequent
closed itemsets.
Definition 1: Let Y1 and Y2 be two non-empty subsets of members in D,
where D is a dimension set in cube C =
and Y1 ∩ Y2 = ∅, and let
X be the set of facts supporting Y1 ∪ Y2. A count-based multidimensional
Toward Integrating Data Warehousing with Data Mining Techniques 26
Copyright © 2007, Idea Group Inc.