This also means that the scheduler no longer processes its input queue in a firstcome-
first-served manner. Instead, the scheduler reorders the queries in the input
queue according to the estimated benefit values. The query for which the routing
algorithm estimates the highest benefit is executed first. However, the scheduler
has to ensure that all queries are still processed. That is, it must avoid starvation
of queries with low benefit values. One approach is to tag queries with an age and
reorder the input queue according to both benefit and age of the queries. Another
one is to introduce a wait time threshold and to preferably route queries whose wait
time exceeds this threshold.
A straightforward approach to approximate the set of tuples accessed by a query
is to keep track of the relations accessed. This is simply achieved by analysing
the FROM clause of the query. Such a ???Cache Approximation by FROM Clause???
(CAF) approximates the state of component caches by the set of relations accessed
by the most recent n queries, and defines the benefit of the cache state of cluster
node C for query Q by the size of the intersection of the cache state and the FROM
clause of the query.
Example.2: Cache Approximation Routing. Assume a small cluster of
two nodes, N1 and N2, with the approximated cache states CacheStateN1
= {Region,Customer,LineItem} and CacheStateN2 = {Orders,LineItem,S
uppliers}.
Pages:
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456