Working with blocks of data implies
that the transformation code must be structured in nested loops (an outer "while"
and an inner "while"), inside which the
transformation macros are executed. The main implementation problem is, however,
how to pass a block of data from one stage to the next and how to exploit the
waiting time associated with I/O operations.
In Figure 6 we depict the basic scheme we adopted in our infrastructure.
We have two pools of buffers: one for input data and one for output data. A pool of
reader threads fills the input buffers; a pool of processUnits picks up the filled buffers,
transforms their contents, and writes the results into the output buffers; and finally another pool of writer
threads flushes the filled output buffers to a database or flat files.
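The scheme can be illustrated with a minimal Python sketch. It is not the implementation of our infrastructure: the names `run_hub`, `reader`, `process_unit`, `writer`, `BLOCK_SIZE`, and `STOP` are hypothetical, each stage runs a single thread instead of a pool, and an in-memory list stands in for the database or flat file. Each buffer pool is modeled as a pair of queues, one holding free buffers and one holding filled buffers, so that a stage blocked on I/O does not stall the others.

```python
import queue
import threading

BLOCK_SIZE = 4        # records per buffer (assumed value, for illustration)
STOP = object()       # end-of-stream marker (an assumption of this sketch)

def reader(source, free_in, filled_in):
    """Reader thread: fill input buffers with blocks of records."""
    records = iter(source)
    done = False
    while not done:
        buf = free_in.get()              # wait for a free input buffer
        buf.clear()
        while len(buf) < BLOCK_SIZE:
            try:
                buf.append(next(records))
            except StopIteration:
                done = True
                break
        filled_in.put(buf)               # hand the block to the processUnit
    filled_in.put(STOP)

def process_unit(transform, free_in, filled_in, free_out, filled_out):
    """processUnit: transform one block at a time."""
    while True:                          # outer loop: one iteration per block
        in_buf = filled_in.get()
        if in_buf is STOP:
            filled_out.put(STOP)
            break
        out_buf = free_out.get()         # wait for a free output buffer
        out_buf.clear()
        i = 0
        while i < len(in_buf):           # inner loop: one iteration per record
            out_buf.append(transform(in_buf[i]))
            i += 1
        free_in.put(in_buf)              # recycle the input buffer
        filled_out.put(out_buf)

def writer(sink, free_out, filled_out):
    """Writer thread: flush filled output buffers to the sink."""
    while True:
        out_buf = filled_out.get()
        if out_buf is STOP:
            break
        sink.extend(out_buf)             # stand-in for a DBMS or flat-file write
        free_out.put(out_buf)            # recycle the output buffer

def run_hub(source, transform, n_buffers=2):
    """Wire the three stages together and return the written records."""
    free_in, filled_in = queue.Queue(), queue.Queue()
    free_out, filled_out = queue.Queue(), queue.Queue()
    for _ in range(n_buffers):
        free_in.put([])
        free_out.put([])
    sink = []
    threads = [
        threading.Thread(target=reader, args=(source, free_in, filled_in)),
        threading.Thread(target=process_unit,
                         args=(transform, free_in, filled_in, free_out, filled_out)),
        threading.Thread(target=writer, args=(sink, free_out, filled_out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink
```

For example, `run_hub(range(10), lambda x: x * 2)` passes ten records through the hub in blocks of four. Extending the sketch to real thread pools would require one end-of-stream marker per consumer and, if record order matters, some form of block sequencing.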
Aside from implementation details concerning the multibuffer pool, there is the
capability to link one input channel to many output channels, the possibility to
Figure 6. Schema of the transformation hub. Diagram labels: input data (from LSN, other); reader thread pool; input buffer pools; elab processUnit; output buffer pools; writer thread pool; Loader module; DBMS, flat-file, etc.
06 Adzic, Fiore, & Sisto
Copyright © 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of
Idea Group Inc. is prohibited.