Thrust of the Chapter
In this section, we identify the main problems that arise during all the phases of an
ETL process. Then, we propose a modeling approach for the construction of ETL
workflows, which is based on the life cycle of the ETL processes.
Data Warehouse Refreshment
Copyright © 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission
of Idea Group Inc. is prohibited.
Problems and Issues of DW Refreshment
Each phase of an ETL process (extraction and transportation, transformation
and cleaning, and loading) raises its own issues, which makes data warehouse
refreshment a very troublesome task. To clarify the complexity and the special
characteristics of ETL processes, we briefly review below the issues,
problems, and constraints that turn up in each phase separately.
• Global problems and constraints: Scalzo (2003) mentions that 90% of the
problems in data warehouses arise during the loading of the data at the nightly
batch cycles. During this period, the administrators have to deal with problems
such as (a) efficient data loading and (b) concurrent job mixture and dependencies.
Moreover, ETL processes have global time constraints, including their
initiation time and their completion deadlines. In fact, in most cases, there is
a tight "time window" in the night that can be exploited for the refreshment
of the data warehouse, since the source system is off-line or not heavily used
during this period.
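The interplay between job dependencies and the nightly time window can be made concrete with a small sketch. It is an illustration under stated assumptions, not a method taken from this chapter: the job names, durations, and window boundaries are hypothetical, and jobs are assumed to run in parallel as soon as all their predecessors have finished. The sketch computes the earliest finish time of each job and checks whether the whole batch completes before the deadline.

```python
from datetime import datetime, timedelta
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def earliest_finish(durations, depends_on, start):
    """Earliest finish time per job, assuming unlimited parallelism."""
    # Map every job to its predecessors (empty tuple if it has none).
    graph = {job: tuple(depends_on.get(job, ())) for job in durations}
    finish = {}
    # static_order() yields each job only after all of its predecessors.
    for job in TopologicalSorter(graph).static_order():
        ready = max((finish[p] for p in graph[job]), default=start)
        finish[job] = ready + durations[job]
    return finish


def fits_window(durations, depends_on, start, deadline):
    """True if every job can finish no later than the deadline."""
    finish = earliest_finish(durations, depends_on, start)
    return max(finish.values()) <= deadline


# Hypothetical nightly batch: two extractions, then cleaning, then loading.
durations = {
    "extract_orders": timedelta(minutes=40),
    "extract_customers": timedelta(minutes=30),
    "clean": timedelta(minutes=60),
    "load": timedelta(minutes=90),
}
depends_on = {
    "clean": ["extract_orders", "extract_customers"],
    "load": ["clean"],
}
start = datetime(2007, 1, 1, 1, 0)     # assumed initiation time: 01:00
deadline = datetime(2007, 1, 1, 5, 0)  # assumed completion deadline: 05:00

print(fits_window(durations, depends_on, start, deadline))
```

In this hypothetical batch the longest dependency chain takes 190 minutes, so the load finishes at 04:10 and the batch fits the 01:00 to 05:00 window. A real scheduler would also have to account for limited parallelism and resource contention among concurrent jobs, which this sketch ignores.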