Implement distributed prefix-sum with MPI

DistributedVector and DistributedVectorView currently have no prefix-sum implementation.
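
One common way to structure a distributed prefix-sum is in three phases: a local scan on each rank, an exclusive scan of the per-rank totals, and a local shift by the resulting offset. The sketch below only illustrates that idea and is not the intended TNL implementation. It simulates the ranks sequentially, so it runs without MPI; the `Blocks` alias and the `distributedInclusiveScan` function are hypothetical names, with each inner vector standing in for the local segment owned by one rank. In a real MPI version, phase 2 could be a single `MPI_Exscan` call with `MPI_SUM`.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a distributed vector (not a TNL type):
// blocks[r] is the local segment owned by rank r.
using Blocks = std::vector<std::vector<long>>;

// Computes the global inclusive prefix-sum in place, block by block.
void distributedInclusiveScan(Blocks& blocks)
{
    const std::size_t nranks = blocks.size();

    // Phase 1: each rank scans its own segment independently
    // and remembers the sum of its local elements.
    std::vector<long> totals(nranks, 0);
    for (std::size_t r = 0; r < nranks; ++r) {
        auto& local = blocks[r];
        std::partial_sum(local.begin(), local.end(), local.begin());
        totals[r] = local.empty() ? 0 : local.back();
    }

    // Phase 2: exclusive scan of the per-rank totals. This is the only
    // step that needs communication; with MPI it would map to MPI_Exscan
    // with MPI_SUM. MPI leaves the result on rank 0 undefined, so a real
    // implementation has to set that offset to 0 explicitly.
    std::vector<long> offsets(nranks, 0);
    for (std::size_t r = 1; r < nranks; ++r)
        offsets[r] = offsets[r - 1] + totals[r - 1];

    // Phase 3: each rank shifts its local results by its offset.
    for (std::size_t r = 0; r < nranks; ++r)
        for (long& x : blocks[r])
            x += offsets[r];
}
```

Empty local segments contribute a total of 0, so ranks that own no elements do not disturb the offsets of the ranks after them.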

Edited by Jakub Klinkovský