To approach your problem, you first need to define what constitutes a "unit of work". This unit of work (or task) is what each thread will execute. Once that is defined, you can reason about what the unit of work needs in order to do its job.
In the case of matrix multiplication, the natural unit of work is a single cell of the resulting matrix. So, given the matrices A[i, j] (i rows, j columns) and B[j, k], your calculation can focus on the dot product of the vectors A.row(x) and B.column(y) for each (0 <= x < i, 0 <= y < k).
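To make the unit of work concrete, here is a minimal sketch of computing one cell, assuming both matrices are stored as int[][] in row-major form. The class name CellValue and the method name compute are made up for illustration; they are not part of any library.

```java
// Hypothetical helper: CellValue and compute are illustrative names only.
final class CellValue {
    /** Dot product of row x of a with column y of b, where a is i-by-j and b is j-by-k. */
    static int compute(int[][] a, int[][] b, int x, int y) {
        int sum = 0;
        for (int n = 0; n < b.length; n++) { // n walks the shared dimension j
            sum += a[x][n] * b[n][y];
        }
        return sum;
    }
}
```

For a = {{1, 2}, {3, 4}} and b = {{5, 6}, {7, 8}}, the cell (0, 0) is 1*5 + 2*7 = 19.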
The next step is to represent each task. The ideal structure for "feeding" tasks to threads is a queue. java.util.concurrent.BlockingQueue is one example, where the synchronization work is done under the hood. Given that you are asked to do the synchronization "by hand", you can use another container, such as a List (or even an array). Your structure will contain one entry for each cell of the resulting matrix. Maybe something like this:
class Cell;
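The declaration above leaves the body of Cell open. A minimal sketch, assuming a Cell only needs to carry the row and column of one result entry, could look as follows. The getX()/getY() accessors are chosen to match what the task code calls; the WorkList class and its allCells method are hypothetical names used only to show how the shared list might be filled.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shape of Cell: just the coordinates of one entry of the result.
final class Cell {
    private final int x; // row index in the result matrix
    private final int y; // column index in the result matrix

    Cell(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }
}

// Hypothetical helper that builds the list of pending work.
final class WorkList {
    /** Returns one Cell per entry of a rows-by-cols result matrix, in row order. */
    static List<Cell> allCells(int rows, int cols) {
        List<Cell> cells = new ArrayList<>(rows * cols);
        for (int x = 0; x < rows; x++) {
            for (int y = 0; y < cols; y++) {
                cells.add(new Cell(x, y));
            }
        }
        return cells;
    }
}
```

An ArrayList is not thread-safe on its own, so every access to this list from the worker threads has to happen inside a synchronized block on the list.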
Now you need a task that, given a cell and the matrices A and B, can calculate the value of that cell. This is your unit of work and therefore what runs in the context of a thread. Here you also need to decide where the result should be placed. In Java, you could use futures and assemble your matrix outside the context of the thread, but to keep things simple, I use a shared array that will hold the results (since, by definition, there will be no collisions).
    import java.util.List;

    class DotProduct implements Runnable {
        int[][] a;
        int[][] b;
        int[][] result;
        List<Cell> cells;

        public DotProduct(int[][] a, int[][] b, int[][] result, List<Cell> cells) { ... }

        public void run() {
            while (true) {
                Cell cell;
                synchronized (cells) { // here, we ensure exclusive access to the shared mutable structure
                    if (cells.isEmpty()) return; // when there are no more cells, we are done
                    cell = cells.remove(0); // take the first cell not calculated yet, so nobody else will work on it
                }
                int x = cell.getX();
                int y = cell.getY();
                int z = 0;
                for (int n = 0; n < b.length; n++) { // z = a.row(x) (dot) b.column(y)
                    z += a[x][n] * b[n][y];
                }
                synchronized (result) {
                    result[x][y] = z;
                }
            }
        }
    }
Now you are almost done. The only thing left to do is create the threads, "feed" each of them a DotProduct task, and wait until they complete. Note that I synchronize on result to update the result matrix. Although by definition there is no possibility of concurrent access to the same cell (each thread works on a different cell), you need to make sure the results are "safely published" to other threads. Explicitly synchronizing on the object does that, provided the thread that reads the matrix also synchronizes on result. Joining the worker threads before reading gives the same visibility guarantee. Declaring result volatile is sometimes suggested as well, but volatile on an array reference covers only the reference, not the elements, and I'm not sure whether you have covered that topic yet.
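The remaining step (create the threads, hand each one the task, wait for completion) can be sketched as below. This is one possible arrangement, not the only one: the class name MatrixMultiplyDemo, the multiply method and the threadCount parameter are assumptions made for this example, and Cell and DotProduct are repeated in compact form only so that the snippet compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MatrixMultiplyDemo {
    // Compact stand-in for the Cell described earlier (assumed shape).
    static final class Cell {
        final int x, y;
        Cell(int x, int y) { this.x = x; this.y = y; }
    }

    // Same idea as the DotProduct task: take a cell, compute it, store it.
    static final class DotProduct implements Runnable {
        private final int[][] a, b, result;
        private final List<Cell> cells;

        DotProduct(int[][] a, int[][] b, int[][] result, List<Cell> cells) {
            this.a = a; this.b = b; this.result = result; this.cells = cells;
        }

        @Override
        public void run() {
            while (true) {
                Cell cell;
                synchronized (cells) {            // exclusive access to the shared list
                    if (cells.isEmpty()) return;  // no work left
                    cell = cells.remove(0);       // claim the next cell
                }
                int z = 0;
                for (int n = 0; n < b.length; n++) {
                    z += a[cell.x][n] * b[n][cell.y];
                }
                synchronized (result) {           // publish the computed value
                    result[cell.x][cell.y] = z;
                }
            }
        }
    }

    /** Multiplies a by b using threadCount worker threads that share one work list. */
    static int[][] multiply(int[][] a, int[][] b, int threadCount) {
        int rows = a.length, cols = b[0].length;
        int[][] result = new int[rows][cols];
        List<Cell> cells = new ArrayList<>();
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
                cells.add(new Cell(x, y));

        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread thread = new Thread(new DotProduct(a, b, result, cells));
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait for the worker; also makes its writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(multiply(a, b, 3))); // prints [[19, 22], [43, 50]]
    }
}
```

All workers share the same list and the same result array, so the number of threads can be chosen independently of the matrix size; a thread that finds the list empty simply returns.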
Hope this helps to understand how to approach the concurrency issue.