All-reduce operation

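The outputs of multiple accelerators are reduced (for example, summed element-wise) into a single result, and that result is then distributed back to every accelerator, so that each one ends up holding the same combined value.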
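
A minimal sketch of the idea in plain Python, assuming an element-wise sum as the default reduction and simulating the accelerators as lists inside a single process. The function name `all_reduce` and the sample values are hypothetical, and no real accelerator or communication library is involved. It illustrates the defining property of an all-reduce: after the operation, every participant holds the same reduced result.

```python
import operator
from typing import Callable, List, Sequence


def all_reduce(
    buffers: Sequence[Sequence[float]],
    op: Callable[[float, float], float] = operator.add,
) -> List[List[float]]:
    """Simulate an all-reduce over per-accelerator buffers (hypothetical helper).

    Each entry of `buffers` stands in for one accelerator's local output.
    Returns one copy of the reduced buffer per accelerator.
    """
    if not buffers:
        raise ValueError("need at least one buffer")
    length = len(buffers[0])
    if any(len(buf) != length for buf in buffers):
        raise ValueError("all buffers must have the same length")

    # Reduce step: combine the buffers element-wise with `op`.
    reduced = list(buffers[0])
    for buf in buffers[1:]:
        reduced = [op(a, b) for a, b in zip(reduced, buf)]

    # Distribute step: every accelerator receives its own copy of the result.
    return [list(reduced) for _ in buffers]


# Three simulated accelerators, each holding a two-element output.
local_outputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = all_reduce(local_outputs)
# Every accelerator now holds [9.0, 12.0].
```

Real systems usually avoid gathering everything in one place and instead use schemes such as ring or tree exchanges between devices; the sketch only models the end state, not the communication pattern.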