This weak model gives one a better chance of achieving
a good solution suited to the specific case.
With these concepts in mind, we developed an infrastructure written in C for the
best performance, using OCI² to access Oracle for flexibility and the POSIX³ OS API
for portability. We also decided to give it a modular structure and to use a
declarative approach wherever possible. From a design point of view, using an
infrastructure or tool of any kind is generally better than a fully programmatic
approach, because it constrains the developer to the guidelines or formalisms it
provides.
Figure 4. ETL infrastructure modules
[Diagram labels: application code; infrastructure modules (listening functions,
in-memory lookup functions, transform & load functions, aggregation functions,
modularization/workflow management, operation & maintenance support, other);
OS access layer; DBMS access layer; operating system; DBMS]
Extraction, Transformation, and Loading Processes
Copyright © 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission
of Idea Group Inc. is prohibited.
Modularization and Workflow Management
A typical ETL application may be decomposed into many interrelated jobs; some of
them must be executed sequentially, while others can run in parallel. In certain
cases, a job can start only when a given condition is met, and so forth.
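
The decomposition described above can be illustrated with a minimal sketch in C.
It is not the authors' infrastructure: the job_t structure, the plan_waves
function, the job names, and the source_file_arrived condition are all
hypothetical, chosen only to show how a declarative job table might express
sequential dependencies, parallelizable jobs, and start conditions.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_JOBS 16
#define MAX_DEPS 4

/* Hypothetical start condition: returns nonzero when the job may start. */
typedef int (*start_condition_fn)(void);

/* Declarative description of one job (all names are illustrative). */
typedef struct {
    const char *name;
    int deps[MAX_DEPS];       /* indices of jobs that must finish first */
    int n_deps;
    start_condition_fn cond;  /* NULL means "no extra condition" */
} job_t;

/*
 * Assigns each job a "wave" number. Jobs in wave 0 have no prerequisites;
 * jobs in wave k depend only on jobs in earlier waves. Jobs that share a
 * wave do not depend on each other and could run in parallel, while the
 * waves themselves run one after another. A job keeps wave -1 if its
 * condition is false, if it depends on such a job, or if it is part of a
 * dependency cycle. Returns the number of waves.
 */
int plan_waves(const job_t *jobs, int n, int *wave)
{
    int i, d, current = 0;

    assert(n >= 0 && n <= MAX_JOBS);
    for (i = 0; i < n; i++)
        wave[i] = -1;

    for (;;) {
        int progress = 0;
        for (i = 0; i < n; i++) {
            int ready = 1;
            if (wave[i] != -1)
                continue;                      /* already planned */
            if (jobs[i].cond && !jobs[i].cond())
                continue;                      /* start condition not met */
            for (d = 0; d < jobs[i].n_deps; d++) {
                int w = wave[jobs[i].deps[d]];
                if (w == -1 || w >= current) { /* prerequisite not in an earlier wave */
                    ready = 0;
                    break;
                }
            }
            if (ready) {
                wave[i] = current;
                progress = 1;
            }
        }
        if (!progress)
            break;
        current++;
    }
    return current;
}

/* Prints the plan, one job per line. */
void print_plan(const job_t *jobs, int n, const int *wave)
{
    int i;
    for (i = 0; i < n; i++)
        printf("%-20s wave %d\n", jobs[i].name, wave[i]);
}

/* Assumed condition for the example: pretend the input file is present. */
static int source_file_arrived(void) { return 1; }

/* Example job table; the jobs and their dependencies are made up. */
const job_t example_jobs[] = {
    { "extract_customers", {0},    0, source_file_arrived },
    { "extract_orders",    {0},    0, NULL },
    { "load_dimensions",   {0},    1, NULL },  /* after extract_customers */
    { "load_facts",        {1, 2}, 2, NULL },  /* after orders and dimensions */
    { "aggregate",         {3},    1, NULL },  /* after load_facts */
};
```

Calling plan_waves(example_jobs, 5, wave) with an int wave[MAX_JOBS] array places
the two extraction jobs in the same wave, so they could be run in parallel, and
puts each of the remaining jobs in a later wave of its own. Keeping the
dependencies in a data table rather than in control flow is one way to follow
the declarative approach mentioned earlier.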