• Loading: The final loading of the data warehouse has its own technical challenges.
A major problem is the ability to discriminate between new and existing
data at loading time. This problem arises when a set of records has to be
classified into (a) new rows that need to be appended to the warehouse and
(b) rows that already exist in the data warehouse but whose values have changed
and must be updated (e.g., with an UPDATE command). Modern ETL tools
already provide mechanisms for this problem, mostly through language
predicates, for example, Oracle's MERGE command (Oracle, 2002). Also,
Data Warehouse Refreshment 2
Copyright © 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission
of Idea Group Inc. is prohibited.
simple SQL commands are not sufficient since the open-loop-fetch technique,
where records are inserted one by one, is extremely slow for the vast volume
of data to be loaded into the warehouse. An extra problem is the simultaneous
usage of the rollback segments and log files during the loading process. The
option to turn them off carries some risk in the case of a loading failure. So
far, the best technique seems to be the use of the batch loading tools offered
by most RDBMSs, which avoid these problems. Other techniques that facilitate
the loading task involve the creation of tables at the same time as the creation
of the respective indexes, the minimization of interprocess wait states,
and the maximization of concurrent CPU usage.
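
The classification of incoming records into new and changed rows, followed by a batched load, can be illustrated with a small sketch. It is not Oracle's MERGE command or any vendor's bulk loader; it uses Python's standard sqlite3 module as a stand-in, and the table name sales, its columns id and amount, and the helper functions classify and load_batch are hypothetical names chosen for the illustration. The sketch reads all existing keys into memory, which is reasonable only for a small demonstration.

```python
import sqlite3


def classify(conn, incoming):
    """Split incoming (id, amount) pairs into new rows and changed rows."""
    # Assumption: the target table is small enough to read into memory.
    existing = dict(conn.execute("SELECT id, amount FROM sales"))
    to_insert, to_update = [], []
    for row_id, amount in incoming:
        if row_id not in existing:
            to_insert.append((row_id, amount))
        elif existing[row_id] != amount:
            # Parameter order matches the UPDATE statement: (amount, id).
            to_update.append((amount, row_id))
    return to_insert, to_update


def load_batch(conn, incoming):
    """Apply one batch in a single transaction rather than row by row."""
    to_insert, to_update = classify(conn, incoming)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO sales (id, amount) VALUES (?, ?)", to_insert
        )
        conn.executemany(
            "UPDATE sales SET amount = ? WHERE id = ?", to_update
        )
    return len(to_insert), len(to_update)


# Hypothetical demo data: an initial load followed by a refresh.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
load_batch(conn, [(1, 10.0), (2, 20.0)])
counts = load_batch(conn, [(2, 25.0), (3, 30.0), (1, 10.0)])
print(counts)  # prints (1, 1): one new row, one changed row
```

In the refresh step, row 3 is appended, row 2 is updated, and the unchanged row 1 is skipped. Grouping the statements into one transaction per batch is the point of contrast with the one-by-one insertion described above; a production load would more likely rely on the bulk loading utility of the target RDBMS.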