
Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"

To cover most of these cases, the main-memory support must be read/write (with a
minimal but effective concurrency control) and support not only "select" but also
"insert," "delete," and "update." This does not mean that main-memory support for
ETL purposes must be a commercial main-memory database. An SQL interface,
ACID properties, full data type set, and so forth are rarely useful in our context and
are expensive; an ad-hoc main-memory support often performs better than a well-structured
main-memory DBMS.
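
As a minimal sketch of what such an ad-hoc main-memory support might look like, the following Python class wraps a dictionary behind a single lock and offers only the four operations named above. The class name `LookupStore`, the keys, and the sample row are hypothetical and chosen for illustration; a single coarse lock is just one simple way to obtain "minimal but effective" concurrency control.

```python
import threading


class LookupStore:
    """Hypothetical ad-hoc main-memory store: a dict guarded by one lock."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()  # coarse-grained concurrency control

    def insert(self, key, row):
        with self._lock:
            if key in self._rows:
                raise KeyError(f"duplicate key: {key!r}")
            self._rows[key] = dict(row)  # store a copy of the caller's row

    def select(self, key):
        with self._lock:
            row = self._rows.get(key)
            # Return a copy so callers cannot modify stored data unlocked.
            return dict(row) if row is not None else None

    def update(self, key, **changes):
        with self._lock:
            self._rows[key].update(changes)  # KeyError if the key is absent

    def delete(self, key):
        with self._lock:
            del self._rows[key]  # KeyError if the key is absent


# Usage with made-up data: a customer lookup keyed by business key.
store = LookupStore()
store.insert("C001", {"name": "Acme", "surrogate_key": 1})
store.update("C001", name="Acme Ltd")
```

There is no SQL parser, no transaction log, and no type system here, which is the point of the passage: for ETL lookups, these few operations are often all that is needed.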
In a data warehouse, the fact tables are big (often millions or billions of records)
because they contain detail records: consider each single carton, can of food, and
so forth bought at the supermarket. Such detail is useful for certain types of analysis
(basket analysis in this case) but useless for others. Accessing the fact table every
time could be too expensive, so we need some form of summary or aggregation.
These aggregations are the datasets resulting from a SQL "group by" performed on
the fact table. The simplest way to do this is to run a SQL query at the end of the loading
phase, but this approach implies serialization and a double scan of the data (the
first read for loading, the second for aggregation). How can one avoid it? In the
loading phase, when one processes a record and the corresponding aggregate row
does not yet exist in memory, one can create a new one and copy the record data into
it; otherwise, one has to update the existing aggregate row with its calculation (sum,
count, etc.
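
The single-pass approach described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Sale` record layout, the grouping columns (store and product), the function name `load_and_aggregate`, and the use of a plain list as a stand-in for the fact table are all hypothetical.

```python
from collections import namedtuple

# Hypothetical detail record; amounts are integer cents to keep sums exact.
Sale = namedtuple("Sale", "store product quantity amount")


def load_and_aggregate(records, fact_table, aggregates):
    """Load each record and maintain the aggregate rows in the same pass."""
    for rec in records:
        fact_table.append(rec)  # stand-in for the real load into the fact table
        key = (rec.store, rec.product)  # the "group by" columns (assumed)
        agg = aggregates.get(key)
        if agg is None:
            # No aggregate row yet: create one and copy the record data into it.
            aggregates[key] = {
                "count": 1,
                "quantity": rec.quantity,
                "amount": rec.amount,
            }
        else:
            # Aggregate row exists: update its running calculations.
            agg["count"] += 1
            agg["quantity"] += rec.quantity
            agg["amount"] += rec.amount


# Usage with made-up data.
records = [
    Sale("S1", "beans", 2, 300),
    Sale("S1", "beans", 1, 150),
    Sale("S2", "milk", 4, 480),
]
fact_table, aggregates = [], {}
load_and_aggregate(records, fact_table, aggregates)
```

Each input record is read once; the aggregate rows held in `aggregates` are ready as soon as loading finishes, so no second scan of the fact table is needed.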

