3.7. Staging Task Output Data

Upon completion, tasks have often created some amount of data. We have seen in Obtaining Task Details how we can inspect the task’s stdout string, but that will not be useful beyond the most trivial workloads. This section shows how to stage the output data of tasks back to the RP application, and/or to arbitrary storage locations and devices.

In principle, output staging is specified in the same way as the input staging discussed in the previous section:

  • source: what files need to be staged from the context of the task that
    terminated execution;
  • target: where should the files be staged to;
  • action: how should files be staged.
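As a minimal sketch of the three fields listed above, a staging directive can be thought of as a plain dictionary. The helper name make_staging_directive and the module-level TRANSFER constant are hypothetical: TRANSFER only stands in for rp.TRANSFER so that the sketch runs without RP installed, and is not meant to describe RP's internals.

```python
# Hypothetical stand-in for rp.TRANSFER, so the sketch runs without RP installed.
TRANSFER = 'Transfer'


def make_staging_directive(source, target, action=TRANSFER):
    """Return a staging directive as a plain dict with the three fields above.

    This helper is illustrative only and is not part of the RP API.
    """
    if not source or not target:
        raise ValueError('both source and target must be non-empty')
    return {'source': source,   # file to pick up from the finished task
            'target': target,   # where the file should be staged to
            'action': action}   # how the file should be staged


directive = make_staging_directive('output.dat', 'output_000.dat')
print(directive)
```

In an actual RP application the dictionary would be assigned to the output_staging attribute of a task description, with rp.TRANSFER as the action.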

Note that this example renames each output file to a unique name during staging, so that the outputs of different tasks do not overwrite each other:

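for i in range(0, n):
    # 'cud' is the description of task i (its creation is omitted here)
    cud.executable     = '/bin/cp'
    cud.arguments      = ['-v', 'input.dat', 'output.dat']
    cud.input_staging  = ['input.dat']
    # stage 'output.dat' back under a name that is unique per task
    cud.output_staging = {'source': 'output.dat',
                          'target': 'output_%03d.dat' % i,
                          'action': rp.TRANSFER}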

06_task_output_data.py is an example application which uses the code block above.
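The renaming pattern can be tried out without RP. The following sketch mimics the directive on the local filesystem only: the temporary directories standing in for task sandboxes and for the application's directory, and the function name simulate_output_staging, are assumptions made for illustration and say nothing about where RP actually places files.

```python
import shutil
import tempfile
from pathlib import Path


def simulate_output_staging(n):
    """Mimic locally what the staging directive asks for; not RP code."""
    staged = []
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp) / 'app'    # stands in for the application's directory
        app_dir.mkdir()
        for i in range(n):
            sandbox = Path(tmp) / ('task_%03d' % i)   # stands in for a task sandbox
            sandbox.mkdir()
            (sandbox / 'input.dat').write_text('payload %d\n' % i)
            # the task itself: cp input.dat output.dat
            shutil.copy(sandbox / 'input.dat', sandbox / 'output.dat')
            # the staging step: same source name, unique target name
            target = app_dir / ('output_%03d.dat' % i)
            shutil.copy(sandbox / 'output.dat', target)
            staged.append(target.name)
    return staged


print(simulate_output_staging(3))
# prints ['output_000.dat', 'output_001.dat', 'output_002.dat']
```

Every task writes the same file name, output.dat, in its own sandbox; only the index in the target name keeps the staged copies apart.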

3.7.1. Running the Example

The result of this example’s execution shows that the output files have been renamed during the output-staging phase:


3.7.2. What’s Next?

Next, we look into an optimization which is important for a large set of use cases: the sharing of input data among multiple tasks.