The Message Passing Interface (MPI) is the de facto standard framework for distributed computing in many HPC applications. MPI collective operations involve a group of processes communicating by message passing in an isolated context, known as a communicator. Each process is identified by its rank, an integer ranging from 0 to P − 1, where P is the size of the communicator. All processes issue the same call (SPMD fashion, i.e. Single Program Multiple Data), although the behavior may depend on the rank of the process.
MPI reductions are among the most useful MPI operations and form an important class of computational operations. A reduction combines data contributed by every process in the communicator using a single operation. The operation can be either user-specified or taken from the list of predefined operations. Usually, the predefined operations are sufficient for most applications.
Consider a system with N processes whose goal is to compute the dot product of two N-vectors in parallel. The dot product of two vectors u and v is u·v = u1v1 + u2v2 + ... + uNvN. As you can imagine, this is highly parallelizable: with N processes, each process i can compute the intermediate value ui × vi. The program then needs a way to sum all of these values, and this is where the reduction comes into play. We can ask MPI to sum all those values and store the result either on only one process (for instance process 0) or to redistribute it to every process.
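To make this concrete, here is a minimal sketch of the dot-product reduction just described, assuming (for simplicity) that each process holds exactly one component of u and v and that the component values are purely illustrative:

```c
/* Sketch of the dot-product reduction: one vector element per process. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each process holds one component of u and v (illustrative values). */
    double u_i = rank + 1.0;
    double v_i = 2.0 * (rank + 1.0);

    /* Local partial product u_i * v_i. */
    double partial = u_i * v_i;

    /* Sum the partial products onto rank 0 only ... */
    double dot = 0.0;
    MPI_Reduce(&partial, &dot, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    /* ... or give every process the result with MPI_Allreduce. */
    double dot_all = 0.0;
    MPI_Allreduce(&partial, &dot_all, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("dot product = %f (allreduce copy: %f)\n", dot, dot_all);

    MPI_Finalize();
    return 0;
}
```

MPI_Reduce leaves the result only on the root (process 0 here), while MPI_Allreduce delivers the same result to every process, matching the two options mentioned above.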
MPI reduction operations fall
into three categories:
1) Global Reduction Operations:
- MPI_REDUCE,
- MPI_IREDUCE,
- MPI_ALLREDUCE and
- MPI_IALLREDUCE.
2) Combined Reduction and Scatter Operations:
- MPI_REDUCE_SCATTER,
- MPI_IREDUCE_SCATTER,
- MPI_REDUCE_SCATTER_BLOCK and
- MPI_IREDUCE_SCATTER_BLOCK.
3) Scan Operations (see the sketch after this list):
- MPI_SCAN,
- MPI_ISCAN,
- MPI_EXSCAN, and
- MPI_IEXSCAN.
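As a quick illustration of the scan category, MPI_Scan gives each rank the inclusive prefix reduction of the values contributed by ranks 0 through its own rank. The sketch below uses purely illustrative per-rank values:

```c
/* Minimal sketch of an inclusive prefix sum with MPI_Scan. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = rank + 1;   /* each rank contributes rank + 1 (illustrative) */
    int prefix = 0;
    MPI_Scan(&value, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    /* On rank r, prefix now holds 1 + 2 + ... + (r + 1). */
    printf("rank %d: inclusive prefix sum = %d\n", rank, prefix);

    MPI_Finalize();
    return 0;
}
```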
The primary idea of these operations is to collectively compute on a set of input data elements to generate a combined output. MPI_REDUCE is a collective function where each process provides some input data (e.g., an array of double-precision floating-point numbers). This input data is combined through an MPI operation, as specified by the “op” parameter. Most applications use MPI predefined operations such as summation or maximum-value identification, although some applications also use reductions based on user-defined function handlers. The MPI operator “op” is always assumed to be associative. All predefined operations are also assumed to be commutative. Applications, however, may define their own operations that are associative but not commutative. The “canonical” evaluation order of a reduction is determined by the ranks of the processes in the group. However, an MPI implementation can take advantage of associativity, or associativity and commutativity, of the operations in order to change the order of evaluation. Doing so may change the result of the reduction for operations that are not strictly associative and commutative, such as floating-point addition.
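To illustrate the user-defined, associative-but-not-commutative case, here is a sketch using MPI_Op_create with commute set to 0. The 2×2 matrix-multiplication operation and the per-rank matrix values are hypothetical, chosen only because matrix multiplication is associative but not commutative:

```c
/* Hypothetical user-defined reduction: 2x2 matrix multiplication, which is
 * associative but not commutative, so MPI_Op_create is passed commute = 0. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

/* User function with the signature required by MPI: combine invec (the
 * operand that is earlier in the canonical rank order) into inoutvec. */
static void matmul_op(void *invec, void *inoutvec, int *len, MPI_Datatype *dtype)
{
    (void)dtype;
    double *a = (double *)invec;
    double *b = (double *)inoutvec;
    for (int i = 0; i < *len; i++) {
        double *A = a + 4 * i, *B = b + 4 * i, C[4];
        C[0] = A[0]*B[0] + A[1]*B[2];
        C[1] = A[0]*B[1] + A[1]*B[3];
        C[2] = A[2]*B[0] + A[3]*B[2];
        C[3] = A[2]*B[1] + A[3]*B[3];
        memcpy(B, C, sizeof C);  /* inoutvec = invec * inoutvec */
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank contributes a 2x2 matrix stored row-major in 4 doubles. */
    double m[4] = {1.0, (double)rank, 0.0, 1.0}, result[4];

    MPI_Op op;
    MPI_Op_create(matmul_op, /* commute = */ 0, &op);

    /* Contiguous derived datatype describing one 2x2 matrix. */
    MPI_Datatype mat2x2;
    MPI_Type_contiguous(4, MPI_DOUBLE, &mat2x2);
    MPI_Type_commit(&mat2x2);

    MPI_Reduce(m, result, 1, mat2x2, op, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("product: [%g %g; %g %g]\n",
               result[0], result[1], result[2], result[3]);

    MPI_Op_free(&op);
    MPI_Type_free(&mat2x2);
    MPI_Finalize();
    return 0;
}
```

Because commute is 0, the implementation may still exploit associativity to regroup the evaluation, but it must preserve the canonical rank order of the operands.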
The following predefined operations are supplied for MPI_REDUCE and related functions such as MPI_ALLREDUCE, MPI_REDUCE_SCATTER, and MPI_SCAN. These operations are invoked by passing one of the following names as the op argument:
- [Name] Meaning
- [MPI_MAX] maximum
- [MPI_MIN] minimum
- [MPI_SUM] sum
- [MPI_PROD] product
- [MPI_LAND] logical and
- [MPI_BAND] bit-wise and
- [MPI_LOR] logical or
- [MPI_BOR] bit-wise or
- [MPI_LXOR] logical xor
- [MPI_BXOR] bit-wise xor
- [MPI_MAXLOC] max value and location
- [MPI_MINLOC] min value and location
Example 1: Get the memory on each node and perform an MPI_SUM reduction to calculate the average memory on the cluster.
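A minimal sketch of this example, assuming one MPI process per node and a Linux-style sysconf() query for the node's physical memory (both are assumptions, not part of the original statement):

```c
/* Sketch: sum per-node memory with MPI_SUM, then average on rank 0.
 * Assumes one MPI process per node and a Linux-style sysconf() query. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Physical memory of this node in GiB (Linux-specific sysconf fields). */
    double mem_gib = (double)sysconf(_SC_PHYS_PAGES) *
                     (double)sysconf(_SC_PAGE_SIZE) /
                     (1024.0 * 1024.0 * 1024.0);

    /* MPI_SUM reduction of the per-node values onto rank 0. */
    double total = 0.0;
    MPI_Reduce(&mem_gib, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("average memory per node: %.2f GiB\n", total / size);

    MPI_Finalize();
    return 0;
}
```

Dividing the MPI_SUM result by the communicator size on rank 0 yields the average; MPI_Allreduce could be used instead if every process needs the result.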