
Robert Wrembel and Christian Koncilia

"Data Warehouses and Olap: Concepts, Architectures and Solutions"

Partitioning, that is, distributing the data of a relation
across different nodes, typically results in intraquery parallelism. Replication in
turn leads to interquery parallelism, as different nodes can evaluate different queries in parallel.
Using these design primitives, we have the following basic alternatives for
physical design in a database cluster:
• Data partitioning: The most common form of data partitioning in a parallel
database environment is horizontal partitioning. With horizontal partitioning,
the tuples of a relation are divided (or declustered) among many or all nodes
of the cluster such that each tuple resides on only one node. Several
partitioning strategies can determine which tuple is stored on which
node: round robin partitioning, hash partitioning, and range partitioning. Round
robin partitioning is the only partitioning strategy that is not based on the
actual values of the data. Instead, assuming a cluster consisting of n nodes, the
ith tuple is simply stored on the (i mod n)-th node. In contrast, with the other
partitioning strategies one or more attributes from the given relational schema
are designated as partitioning attributes. Hash partitioning hashes each tuple on
the partitioning attributes using a hash function whose range is [1, ..., n]. Range
partitioning assigns value ranges of the partitioning attributes to certain cluster
nodes.
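
The three strategies described above can be illustrated with a minimal Python sketch. It is not taken from the book: the function names, the use of CRC32 as the hash function, the representation of tuples as dictionaries, and the encoding of ranges as a sorted list of boundaries are all assumptions made for illustration. Nodes are numbered 0 to n-1 here, which matches the (i mod n) rule; a hash function on [1, ..., n] would simply add 1.

```python
import bisect
import zlib


def round_robin_node(i, n):
    """Node for the i-th tuple; the tuple's values are ignored."""
    return i % n


def hash_node(row, attrs, n):
    """Node chosen by hashing the partitioning attributes.

    CRC32 is an assumed stand-in for whatever hash function a real
    system uses; it is deterministic across runs, unlike hash().
    """
    key = "|".join(str(row[a]) for a in attrs).encode("utf-8")
    return zlib.crc32(key) % n


def range_node(value, boundaries):
    """Node chosen by value range.

    `boundaries` is a sorted list of split points (hypothetical encoding):
    values below boundaries[0] go to node 0, values in
    [boundaries[0], boundaries[1]) go to node 1, and so on.
    len(boundaries) + 1 nodes are addressed in total.
    """
    return bisect.bisect_right(boundaries, value)


def partition(rows, n, node_of):
    """Decluster rows over n nodes; each row lands on exactly one node."""
    nodes = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        nodes[node_of(i, row)].append(row)
    return nodes


# Six example tuples with made-up attributes "id" and "amount".
rows = [{"id": k, "amount": k * 50} for k in range(6)]

by_round_robin = partition(rows, 3, lambda i, row: round_robin_node(i, 3))
by_hash = partition(rows, 3, lambda i, row: hash_node(row, ["id"], 3))
by_range = partition(rows, 3, lambda i, row: range_node(row["amount"], [100, 200]))
```

In this sketch only round robin uses the tuple's position i; the hash and range variants look solely at attribute values, so a lookup on the partitioning attribute can be routed to a single node, whereas a round robin layout would require asking every node.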

