
Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"

This process is called partitioning if the relation is not yet partitioned, or
repartitioning if the relation is already partitioned but must be reorganized.
Both operations can be costly because they may require heavy data exchange over
the network connecting the nodes. In this work, partitioning (and placement)
refers not to the operation of partitioning while processing a join but to an
initial placement and sporadic reorganization task that decides which relations
are to be divided or replicated across nodes and which partitioning attributes
are to be used. Williams and Zhou (1998) review five major data placement
strategies (size-based, access frequency-based, and network traffic-based) and
conclude experimentally that the way data is placed in a shared-nothing
environment can have a considerable effect on performance. Hua and Lee (1990)
use variable partitioning (size- and access frequency-based) and conclude that
partitioning increases throughput for short transactions, but that complex
transactions involving several large joins suffer reduced throughput as
partitioning increases.
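
The difference between partitioning and repartitioning, and the network cost that repartitioning can incur, can be illustrated with a minimal sketch. It is not taken from the chapter: the function names `hash_partition` and `repartition`, the toy `orders` relation, and the use of integer keys with a simple modulo as the hash function are all assumptions made for illustration.

```python
# Hypothetical sketch: hash partitioning of a relation across nodes, and
# repartitioning on a different attribute. Assumes integer attribute values
# and uses "value modulo number of nodes" as a stand-in hash function.

def hash_partition(rows, attr, num_nodes):
    """Place each row on the node chosen by hashing the partitioning attribute."""
    placement = {node: [] for node in range(num_nodes)}
    for row in rows:
        placement[row[attr] % num_nodes].append(row)
    return placement


def repartition(placement, attr, num_nodes):
    """Reorganize an already partitioned relation on a new attribute.

    Returns the new placement and the number of rows that had to move to a
    different node, a rough proxy for the data exchanged over the network.
    """
    new_placement = {node: [] for node in range(num_nodes)}
    moved = 0
    for node, rows in placement.items():
        for row in rows:
            target = row[attr] % num_nodes
            new_placement[target].append(row)
            if target != node:
                moved += 1
    return new_placement, moved


# Toy relation (assumed data): 8 orders, two orders per customer.
orders = [{"order_id": i, "customer_id": i // 2} for i in range(8)]

by_order = hash_partition(orders, "order_id", num_nodes=4)
by_customer, moved = repartition(by_order, "customer_id", num_nodes=4)
print(f"rows moved between nodes: {moved} of {len(orders)}")
```

In this toy case most rows change node when the partitioning attribute changes, which is why the choice of partitioning attributes is treated as an initial placement decision rather than something to be redone for every join.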
Some of the most promising partitioning and placement approaches choose the
partitioning based on the query workload (Rao, Zhang, & Megiddo, 2002; Zilio,
Jhingran, & Padmanabhan, 1994). These strategies use the query workload to determine the
Efficient and Robust Node-Partitioned Data Warehouses 20
Copyright © 2007, Idea Group Inc.