These components resemble a processUnit but sit at a higher level: they have a
termination status, and their dependency rules are structured as simple trees
rather than graphs. Each component can have an associated recoveryProcessUnit,
so called because it holds the undo code (e.g., truncate partition).
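The structure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class names `ProcessUnit` and `Component`, the `children` list used to represent the dependency tree, and the `undo` and `walk` helpers are all assumptions introduced here. The only ideas taken from the text are that a component groups processUnits, may carry an optional recoveryProcessUnit with undo code, and is arranged in a tree.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass
class ProcessUnit:
    """Hypothetical stand-in for a processUnit: a named piece of work."""
    name: str
    action: Callable[[], None]


@dataclass
class Component:
    """Hypothetical higher-level grouping of processUnits."""
    name: str
    process_units: List[ProcessUnit] = field(default_factory=list)
    # Optional undo code, e.g. truncating the partition that was being loaded.
    recovery_unit: Optional[ProcessUnit] = None
    # Assumption: the tree is modeled as parent -> dependent children.
    children: List["Component"] = field(default_factory=list)

    def undo(self) -> bool:
        """Run the undo code if there is any; report whether it ran."""
        if self.recovery_unit is None:
            return False
        self.recovery_unit.action()
        return True


def walk(component: Component) -> Iterator[str]:
    """Yield component names depth first, each parent before its children."""
    yield component.name
    for child in component.children:
        yield from walk(child)
```

Because the dependencies form a tree rather than a general graph, a plain depth-first walk is enough to visit every component after the one it depends on; no cycle detection or topological sort is needed in this sketch.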
The synchronization engine keeps track (on the DBMS) of the termination status of
each processUnit, so the engine can evaluate the status of each component and
of the entire ETL process itself. At the end, we have a status table that records
whether any failure has occurred (i.e., the process needs recovery) and which
component has failed.
With this information, we can recover only the failed components and, which is
important, in an automatic way. In simplified terms, the synchronization engine,
running in recovery mode, has to reverse the component status and then start only
the failed components (or, to be more precise, the processUnits of the failed
components). In practice, there are some further complications: a component in
recovery mode can depend on a previous one that has not failed but must be rerun
because it does an initialization job (hence we have introduced the onRecovery
dependence); the process running in recovery mode must always use a point-in-time
date; and there are other minor complications.
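The recovery logic described above can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual code: the status table is represented as a plain dictionary rather than a DBMS table, the names `Status`, `component_status`, `needs_recovery`, and `recovery_plan` are hypothetical, and the onRecovery dependence is modeled as a mapping from a component to the components that must be rerun before it in recovery mode. The point-in-time date and transitive onRecovery chains are not modeled.

```python
from enum import Enum
from typing import Dict, Iterable, List


class Status(Enum):
    OK = "OK"
    FAILED = "FAILED"


# Hypothetical in-memory stand-in for the status table kept on the DBMS:
# component name -> {processUnit name -> termination status}.
StatusTable = Dict[str, Dict[str, Status]]


def component_status(unit_statuses: Iterable[Status]) -> Status:
    """A component has failed if any of its processUnits has failed."""
    failed = any(s is Status.FAILED for s in unit_statuses)
    return Status.FAILED if failed else Status.OK


def needs_recovery(table: StatusTable) -> bool:
    """The whole process needs recovery if any component has failed."""
    return any(
        component_status(units.values()) is Status.FAILED
        for units in table.values()
    )


def recovery_plan(table: StatusTable,
                  on_recovery: Dict[str, List[str]]) -> List[str]:
    """List the components to rerun in recovery mode, in order.

    Only failed components are rerun, plus any component that a failed one
    names as an onRecovery dependence (e.g. an initialization job).
    """
    plan: List[str] = []
    for name, units in table.items():
        if component_status(units.values()) is not Status.FAILED:
            continue
        # Rerun onRecovery dependencies first, then the failed component.
        for candidate in on_recovery.get(name, []) + [name]:
            if candidate not in plan:
                plan.append(candidate)
    return plan


table = {
    "init": {"create_temp_tables": Status.OK},
    "load_sales": {"extract": Status.OK, "load": Status.FAILED},
    "load_customers": {"extract": Status.OK},
}
plan = recovery_plan(table, on_recovery={"load_sales": ["init"]})
print(plan)  # ['init', 'load_sales']
```

In this example `load_customers` succeeded and is left alone, while `init` is rerun even though it did not fail, because `load_sales` declares it as an onRecovery dependence. A real engine would also run each failed component's undo code before restarting it.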