
Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"

The actual decision on whether to partition or
replicate relations requires a cost model that we review later.
Partitioning Strategies
In this section we define a set of strategies that take into consideration partitioning
and replication. In the following section a generic cost model will also be presented.
Consider the TPC-H data warehouse schema of Figure 4 from TPC (1999). It contains
several large relations, which are frequently involved in joins. The schema
represents ordering and selling activity (LI-lineitem, O-orders, PS-partsupp, P-part,
S-supplier, C-customer), where relations such as LI, O, PS, and even P are quite
large. There are also two very small relations, NATION and REGION, which are not depicted
in the figure because they can be readily replicated to all nodes.
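The split between very small relations (replicated to every node) and large ones (candidates for partitioning) can be sketched as follows. This is only an illustration under stated assumptions: the row counts are approximate TPC-H figures at scale factor 1, the size threshold is a hypothetical cutoff, and hash partitioning on an integer key is assumed for simplicity. For the larger relations, the actual choice depends on the cost model discussed in the text, not on a fixed threshold.

```python
# Approximate TPC-H row counts at scale factor 1 (illustrative only).
ROW_COUNTS = {
    "LI": 6_000_000, "O": 1_500_000, "PS": 800_000, "P": 200_000,
    "C": 150_000, "S": 10_000, "NATION": 25, "REGION": 5,
}

# Hypothetical cutoff: relations at or below this size are replicated.
REPLICATION_THRESHOLD = 1_000


def choose_placement(row_counts, threshold=REPLICATION_THRESHOLD):
    """Mark each relation as 'replicate' (tiny) or 'partition' (large)."""
    return {
        name: "replicate" if rows <= threshold else "partition"
        for name, rows in row_counts.items()
    }


def place_row(relation, key, placement, n_nodes):
    """Return the ids of the nodes that store a row with the given integer key."""
    if placement[relation] == "replicate":
        # A replicated relation is stored in full on every node.
        return list(range(n_nodes))
    # Assumed hash partitioning: each row lives on exactly one node.
    return [key % n_nodes]


placement = choose_placement(ROW_COUNTS)
```

With four nodes, a NATION row is stored on all four nodes, while an O row with key 10 is stored on node 2 only.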
2 4 Furtado
Copyright © 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of
Idea Group Inc. is prohibited.
Figure 5(a) shows a generic query Qa, and a possible "star-join" execution plan for
that query is shown in Figure 5(b).
Given this example schema, the challenge is how to partition, process, and provide
availability to obtain an efficient, low-cost, platform-independent shared-nothing
data warehouse. We wish to determine what would be a good partitioning strategy
to process queries, considering that each relation could either be fully partitioned
or replicated.
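Because each relation is either fully partitioned or replicated, the space of candidate strategies for n relations has 2^n members. The sketch below enumerates that space for the six TPC-H relations named in the text and computes the rows each node would store under a given strategy. The function names and the per-node storage measure are assumptions made for illustration; they are not the cost model referred to in the text.

```python
from itertools import product

# The six larger relations named in the text (NATION and REGION are
# assumed to be replicated everywhere and are left out of the search).
RELATIONS = ["LI", "O", "PS", "P", "S", "C"]


def enumerate_strategies(relations):
    """Yield every assignment of 'partition' or 'replicate' to the relations."""
    for choice in product(("partition", "replicate"), repeat=len(relations)):
        yield dict(zip(relations, choice))


def rows_per_node(strategy, row_counts, n_nodes):
    """Rows stored on one node: a full copy if replicated, a 1/n share otherwise."""
    total = 0
    for relation, mode in strategy.items():
        rows = row_counts[relation]
        if mode == "replicate":
            total += rows
        else:
            total += -(-rows // n_nodes)  # ceiling division
    return total


strategies = list(enumerate_strategies(RELATIONS))
```

For six relations this yields 64 candidate strategies. A cost model would then rank them by estimated query processing cost rather than by storage alone.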

