A metadata management repository is employed to store the different activities of a large
workflow, along with important data these processes employ.
Motivation.
All engineering disciplines employ blueprints during the design of their engineering
artifacts. Modeling in this fashion is not a task with a value, per se; as Booch, Rumbaugh,
and Jacobson (1998) mention, "we build models to communicate the desired
structure and behavior of our system ... to visualize and control the system's architecture
... to better understand the system we are building ... to manage risk."
Discussing the modeling of ETL workflows is important for several reasons. First,
the data extraction, transformation, integration, and loading process is a key part of
a data warehouse. Sales of the commercial ETL tools available on the market
grew from US$101 million in 1998 to US$210 million in 2002, a steady
increase of approximately 20.1% per year (Jarke, Lenzerini, Vassiliou, &
Vassiliadis, 2003). The same survey indicates that ETL tools rank third in
annual sales among the components of a data warehouse, behind RDBMS sales
for data warehouses in first place (40% each year since 1998) and data marts
(25%) in second place.
Second, ETL processes constitute the major part of a data warehouse environment
and therefore account for a corresponding share of its development effort and cost.