Research Problems and Challenges
The previous discussion demonstrates that the problem of designing an efficient, robust,
and evolvable ETL workflow is relevant and pressing. To be more specific, and to understand
the requirements for the design and evolution of a data warehouse, we have
to clarify how ETL workflows fit into the data warehouse life cycle.
As we can see in Figure 2, the life cycle of a data warehouse begins with an initial
reverse engineering and requirements collection phase where the data sources are
analyzed in order to comprehend their structure and contents. At the same time,
any requirements on the part of the users (normally a few power users) are also collected.
The deliverable of this stage is a conceptual model for the data stores and the
processes involved. In a second stage, namely the logical design of the warehouse,
the logical schema for the warehouse and the processes are constructed. Third, the
logical design of the schema and processes is optimized and refined to the choice
Figure 2. [Diagram of the data warehouse life cycle. Stage labels: Reverse Engineering of Sources & Requirements Collection; Logical Design; Tuning, Full Activity Description; Software Construction; Administration of DW. Deliverable labels: Conceptual Model for DW, Sources & Processes; Logical Model for DW, Sources & Processes; Physical Model for DW, Sources & Processes; Software & SW Metrics; Metrics.]