
Robert Wrembel and Christian Koncilia

"Data Warehouses and OLAP: Concepts, Architectures and Solutions"

An ETL application that does not manage interrupted DW loads efficiently cannot
be called "robust." Some works in the literature discuss the problems related to
resuming interrupted DW loads (Labio, Wiener, Garcia-Molina, & Gorelik, 2000).
The simplest way to guarantee data consistency after a failure is a global
rollback of all loaded and modified data. This is quite easy for loaded data
(all the loaded partitions need to be truncated) and a bit more difficult for
modified data (tables must be restored from previously saved data). This approach
works for serious and complex failures, but when a problem involves only a
portion of the ETL process (e.g., an update of a dimension table), it could be
unacceptable to throw away all the work already done, especially when the entire
process takes hours.
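
The global rollback described above can be sketched as follows. This is only an
illustration under stated assumptions, not the authors' implementation: it uses
an in-memory SQLite database, a plain DELETE stands in for truncating a loaded
partition, and the helper names (snapshot, global_rollback) and table names
(dim_customer, fact_sales_load) are hypothetical.

```python
import sqlite3


def snapshot(conn, table):
    """Save a copy of a table that the load is about to modify."""
    conn.execute(f"DROP TABLE IF EXISTS {table}_backup")
    conn.execute(f"CREATE TABLE {table}_backup AS SELECT * FROM {table}")


def global_rollback(conn, loaded_tables, modified_tables):
    """Undo the whole load: empty loaded tables, restore modified ones."""
    for table in loaded_tables:
        # Stand-in for truncating a loaded partition (assumption).
        conn.execute(f"DELETE FROM {table}")
    for table in modified_tables:
        # Restore the table from the copy saved before the load started.
        conn.execute(f"DELETE FROM {table}")
        conn.execute(f"INSERT INTO {table} SELECT * FROM {table}_backup")
    conn.commit()


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dim_customer (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE fact_sales_load (customer_id INTEGER, amount REAL)")
    conn.execute("INSERT INTO dim_customer VALUES (1, 'Alice')")
    conn.commit()

    snapshot(conn, "dim_customer")  # taken before any modification
    try:
        conn.execute("INSERT INTO fact_sales_load VALUES (1, 10.0)")
        conn.execute("UPDATE dim_customer SET name = 'Alicia' WHERE id = 1")
        conn.commit()
        raise RuntimeError("simulated failure later in the load")
    except RuntimeError:
        global_rollback(conn, ["fact_sales_load"], ["dim_customer"])

    facts = conn.execute("SELECT COUNT(*) FROM fact_sales_load").fetchone()[0]
    names = [row[0] for row in conn.execute("SELECT name FROM dim_customer")]
    return facts, names
```

Note that the rollback discards the successful fact load together with the
dimension update, which is exactly the cost the text objects to for long-running
processes.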
A better way to manage partial failures is to organize the ETL process into functional
components (also useful for documentation and modularization purposes) that can
be individually recovered. Previously, we described processUnits (small blocks
of code) and the synchronization engine that starts them according to predefined
rules. This modularization is too fine-grained for recovery purposes (how can one
recover from the failure of a single processUnit instance?), so we built on top of
them a logical container of processUnits called a component.
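
One possible reading of component-level recovery is sketched below. The text on
this page does not specify how a component is recovered, so everything here is
an assumption made for illustration: the Component class, its undo callback, the
status dictionary, and the run_etl function are hypothetical names, and
processUnits are modeled as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Component:
    """Logical container of processUnits that is recovered as a whole."""
    name: str
    process_units: List[Callable[[], None]]
    undo: Callable[[], None]  # restores this component's targets (assumption)


def run_etl(components: List[Component], status: Dict[str, str]) -> bool:
    """Run components in order, skipping those already marked as done."""
    for comp in components:
        if status.get(comp.name) == "done":
            continue  # completed in an earlier run, nothing to redo
        try:
            for unit in comp.process_units:
                unit()
            status[comp.name] = "done"
        except Exception:
            comp.undo()  # roll back only the failed component
            status[comp.name] = "failed"
            return False
    return True


def demo():
    log = []
    attempts = {"n": 0}

    def flaky_update():
        # Fails on the first attempt only, to simulate a partial failure.
        attempts["n"] += 1
        if attempts["n"] == 1:
            raise RuntimeError("simulated failure")
        log.append("update_dimension")

    components = [
        Component("load_facts", [lambda: log.append("load_facts")],
                  undo=lambda: None),
        Component("update_dims", [flaky_update],
                  undo=lambda: log.append("undo update_dims")),
    ]
    status = {}
    first = run_etl(components, status)   # fails inside update_dims
    second = run_etl(components, status)  # resumes; load_facts is skipped
    return first, second, log
```

In this sketch the second run repeats only the failed component, so the work
done by load_facts is kept rather than thrown away.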

