The replay_op structure seemed to be the most appropriate place for storing the timing information, though rsd_node would also have been a good choice. The following things were considered when making this decision.
The changes in opstruct.h are as shown below. These were marked out as explained earlier.
enum { MIN_TIME = 0, AVG_TIME = 1, MAX_TIME = 2, TIME_FIELDS = 3 };
...
...
typedef struct {
    ...
    long int time[TIME_FIELDS];
    ...
} replay_op;
While compressing, the minimum and maximum times are trivial to compute, but the average poses a problem. The problem does not occur when the two nodes being compressed are both pure (not part of an rsd compression sequence); it occurs when either or both of them are already compressed nodes. When attempting to compress nodes a and b, node a may have resulted from the compression of two MPI calls while b is a pure node. The average is then not simply the sum of their average times divided by two; each average has to be weighted by the number of calls it represents. To address this issue, we needed to be able to determine, at compression time, how many MPI calls are represented by a single node. After exploring the code as far as was practical, no direct or simple mechanism for retrieving this information could be found. The data structures probably store this information indirectly, or there may be an indirect mechanism for retrieving it (for example, by traversing the stack).
Since this information need not be part of the final trace file, I thought adding it as a field would cost only some space at runtime and a small overhead in communication. The field was added to the rsd_node structure, which is the most appropriate place for it because an rsd_node represents the aggregation of multiple MPI calls. The new field keeps track of the number of nodes the rsd_node has aggregated.
The following field was added to rsd_node in rsd_queue.h.
typedef struct rsd_node_t {
    ...
    int numAggregated;
    ...
} rsd_node;
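To illustrate how a count such as numAggregated makes the average well defined, here is a minimal sketch of a merge step. It is not the actual compression code. The timing_summary type and the merge_timing function are hypothetical names introduced only for this example; they pair the time[] array from replay_op with the counter from rsd_node so that the example is self-contained. The sketch also assumes that a pure node carries a count of 1.

```c
#include <assert.h>

enum { MIN_TIME = 0, AVG_TIME = 1, MAX_TIME = 2, TIME_FIELDS = 3 };

/* Hypothetical stand-in for illustration only: combines the time[] array
 * of replay_op with the numAggregated counter of rsd_node. */
typedef struct {
    long int time[TIME_FIELDS];
    int numAggregated;          /* assumed to be 1 for a pure node */
} timing_summary;

/* Fold the timing of node b into node a (hypothetical helper). */
static void merge_timing(timing_summary *a, const timing_summary *b)
{
    int count = a->numAggregated + b->numAggregated;
    assert(count > 0);

    /* Weight each average by the number of calls it represents. */
    long int total = a->time[AVG_TIME] * a->numAggregated
                   + b->time[AVG_TIME] * b->numAggregated;

    if (b->time[MIN_TIME] < a->time[MIN_TIME])
        a->time[MIN_TIME] = b->time[MIN_TIME];
    if (b->time[MAX_TIME] > a->time[MAX_TIME])
        a->time[MAX_TIME] = b->time[MAX_TIME];

    a->time[AVG_TIME] = total / count;   /* integer division truncates */
    a->numAggregated = count;
}
```

For example, if node a has an average of 10 over two calls and node b is a pure node with a time of 40, the weighted result is (10 * 2 + 40 * 1) / 3 = 20, whereas the naive (10 + 40) / 2 would give 25.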