This example shows how task dependency is used in a two stage pipeline
application. The problem is a simple simulation.
An object P is placed in the middle of a flat surface with a bounding rectangular
box. On each simulation step, the object moves in a random distance in a random
direction. It moves back to the initial position when it hits the side walls
of the bounding box. The problem is to calculate the number of hits to the
four walls in a given time period.
Figure 1. Object P
randomly hits the side wall of the bounding box
A two stage pipeline is used to solve the problem so that the random number
generation and the simulation can be paralleled:
- The first stage generates random numbers using a pseudo random number
generator
- The second stage simulates the movements
Because ALF currently does not support pipeline directly, a pipeline
structure is simulated using task dependency. There are two tasks which correspond
to the two pipeline stages.
For this problem, each simulation step only needs a small amount of data
just as a motion vector. Although ALF does not have a strict limit on how
small the data can be, it is better to use larger data blocks for performance
considerations. Therefore, the data for thousands of simulation steps is grouped
into a single work block.
Stage 1 task: For the stage 1 task, a Lagged Fibonacci pseudo random
number generator (PRNG) is used for simplicity. In this example, the algorithm
is as follows:
Sn=(Sn-j^Sn-k)%232
where
k >
n >
0 and
k = 71,
j = 65
The algorithm requires a length k history buffer to save
the older values. In this implementation, the task context is used for the
history buffer. Because no input data is needed, the work block for this task
only has output data.
Stage 2 task: For the stage 2 task, the task context is used to
save the current status of the simulation including the position of the object
and the number of hits to the walls. The work block in this stage only has
input data, which are the PRNG results from stage 1.
Another target of pipelining is to overlap the execution of different stages
for performance improvement. However, this requires work block level task
synchronization between stages, and this is not yet supported by ALF. The
alternative approach is to use multiple tasks whereby each task only handles
a percentage of the work blocks for the whole simulation.
So there are now two stage tasks. For each chunk of work blocks, the following
two tasks are created:
- The stage 1 task generates the random numbers and writes out the results
to a temporary buffer
- The stage 2 task reads the random numbers from the temporary buffer to
do the simulation
A task dependency is set between the two tasks to make sure the stage 2
task can get the correct results from stage 1 task. Because both the PRNG
and the simulation have internal states, you have to pass the states data
between the succeeding tasks of the same stage to preserve the states. The
approach described here lets the tasks for the same stage share the same task
context buffer. Dependencies are used to make sure the tasks access the shared
task context in the correct order.
Figure 2 (a) shows the task dependency
as described in previous discussions. To further reduce the use of temporary
intermediate buffers, you can use double or multi-buffering technology for
the intermediate buffers. The task dependency graph for double buffering the
intermediate buffers is shown in
Figure 2 (b),
where a new dependency is added between the n-2th stage 2 task and the nth
stage 1 task to make sure the stage 1 task does not overwrite the data that
may still be in use by the previous stage 2 task. This is what is implemented
in the sample code.
Figure 2. Task dependency
examples
Source code
The complete source code
can be found in the sample directory pipe_line.