Lab 5. The Block Matrix Multiplication – Cannon Algorithm

This lab would require you to work with the Cartesian topology in order to manipulate a big matrix on blocks. You have to organise size =p*p processors into a grid topology. Then the square matrix is divided up into similar square blocks which you have to send to the processors. Block i,j of the matrix is sent to processor i,j of the grid. After you complete this task you can proceed to implement the Cannon matrix multiplication.


Task 1. Block sending

-          Start from the program matrix.c and understand the functions from it.

-          Read up from the mpi tutorials how to work with the Cartesian topology.

o   How to create a topology.

o   How to manipulate coordinates.

-          Put in place code to organise the processors into a Cartesian topology.

-          Put in place code to find the 4 direct neighbours of the processor rank.

o   From the coord-s or rank find the coord-s or the other 4 processors; then translate them into ranks.

-          If processor 0 then

o   For each block (i,j) do 1) Extract the matrix bock from a and 2) Send the matrix block to Processor (i,j).

o   Do not forget to translate the coordinates into rank. 

-          Print the local matrix for each processor.

 


Task 2. Implement Cannon Algorithm for Block Matrix Multiplication

-          Partition a on blocks and assign them to the processor grids on rows è Block i,j goes to i,(j-i)%p

-          Partition b on blocks and assign them to the processor grids on columns è Block i,j goes to (i-j)%p,j

-          Repeat for p times

o   Multiply the local matrices local_a and local_b and accumulate the result in local_c.

o   Send the matrices local_a and local_b to the left and top nodes.

o   Receive the matrices local_a and local_b to the right and bottom nodes.

-          Collect the blocks of local_c on processor 0.


Note: The solution to task 2 is here.

Initial positions

Picture1

First Roll

Picture2