Improving matrix calculation in F#

I wrote code to do a basic matrix calculation in F#. I would like to know whether this code can be improved to reduce the computation time. The operations performed are quite basic (mainly multiplication of two matrices and transposition), but the matrices are large (about 10000 * 100000), which leads to a huge computation time (several hours).

My questions / comments are as follows:

  • Is there any way to improve the following code? There are many "loops" that may seriously slow down the algorithm, but I don't know how to avoid them.
  • I create some matrices with initial values of 0 and then fill their elements with results in a second step. Perhaps this first initialization stage can be avoided.

Here is the algorithm:

    // I use the #time directive to measure how long the algorithm takes
    #time
    #r "Microsoft.Office.Interop.Excel"
    #r "FSharp.PowerPack.dll"

    open System
    open System.IO
    open Microsoft.FSharp.Math
    open System.Collections.Generic

    // Algorithm
    let matrixCalculation (matA : matrix) (matB : matrix) (matC : matrix) =

        // First step: name the sizes of matrices A and B, used to initialize "matrixCalcul"
        let nbrOfElementsA = matA.NumRows
        let nbrOfElementsB = matB.NumRows
        let nbrOfCaracteristicsA = matA.NumCols
        let nbrOfCaracteristicsB = matB.NumCols

        // Second step: matB has to be transposed
        let tmatB = matB.Transpose

        // Initialization of the final output, named matrixCalcul. A weighted vector is also initialized
        let mutable matrixCalcul = Matrix.create (nbrOfElementsA + 1) (nbrOfElementsB + 1) 0.
        let mutable weightedVector = Matrix.create nbrOfCaracteristicsA 1 0.

        // The first columns of matA and matB hold IDs; they are copied into the
        // first column and first row of matrixCalcul respectively
        matrixCalcul.[1.. ,0..0] <- matA.[0..,0..0]
        matrixCalcul.[0..0,1 ..] <- matB.[0..,0..0].Transpose

        // Then the core of "matrixCalcul" can be calculated
        for j = 0 to (nbrOfElementsB - 1) do
            weightedVector <- matC * tmatB.[1..(nbrOfCaracteristicsB - 1),0..(nbrOfElementsB-1)].Columns(j,1)
            for i = 0 to (nbrOfElementsA - 1) do
                let mutable acc = matA.[0..(nbrOfElementsA - 1),1..(nbrOfCaracteristicsA-1)].Rows(i,1) * weightedVector
                matrixCalcul.[i+1,j+1] <- (acc.[0,0])
        matrixCalcul

    // Two matrix generators (one for matA and matB, another one for matC)
    let matrixTestGeneratorAandB nbrOfElements nbrOfCaracteristics =
        let matrixTestGeneratedAandB =
            Matrix.create nbrOfElements nbrOfCaracteristics 0.
            |> Matrix.mapi (fun i j value -> if j = 0 then float(i + 1) elif j % 2 = 0 then 1. else 0.)
        matrixTestGeneratedAandB

    let matrixTestGeneratorC nbrOfElements nbrOfCaracteristics =
        let matrixTestGeneratedC =
            Matrix.create nbrOfElements nbrOfCaracteristics 0.
            |> Matrix.mapi (fun i j value -> if j = 0 then 0. elif j % 2 = 0 then 1. else 0.)
        matrixTestGeneratedC

    // Generation of matrixA, matrixB and matrixC
    let matrixA = matrixTestGeneratorAandB 100 179
    let matrixB = matrixTestGeneratorAandB 100 639
    let matrixC = matrixTestGeneratorC 178 638

    // Calculation
    matrixCalculation matrixA matrixB matrixC

With these sizes, the computation takes about 2 seconds, but if you increase the number of rows of matrixA and matrixB to 10000, it can take an hour. For information, in my algorithm the size of matrixC stays constant; only matrices A and B can have a growing number of rows.

If you have any ideas for improvement, I would be glad to hear them.

1 answer

From your code, it's pretty hard to figure out what you are trying to achieve. I think you want to calculate the matrix d[0..m, 0..n] as follows:

    +---------+-------------------------+
    |   0.0   | b00 b10 ...... b(n-1)0  |
    +---------+-------------------------+
    |   a00   | d11 d12 ...... d1n      |
    |   a10   | d21 d22 ...... d2n      |
    |   ...   | ... ... ...... ...      |
    |   ...   | ... ... ...... ...      |
    |   ...   | ... ... ...... ...      |
    | a(m-1)0 | dm1 dm2 ...... dmn      |
    +---------+-------------------------+

where the main part (the inner matrix d[1..m, 1..n]) is the product of three matrices: matA1 (matA with its first column removed), matC and matB1 (matB with its first column removed, then transposed).

A good way to understand how the matrices fit together is to reason about their sizes. Let ra, ca, rb, cb, rc and cc denote the number of rows and columns of matA, matB and matC respectively. The product involves three matrices of size ra x (ca-1), rc x cc and (cb-1) x rb; this only makes sense if rc = ca-1 and cc = cb-1. The resulting matrix d has size (ra+1) x (rb+1).
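The same size reasoning also gives a rough cost estimate. The comparison below is not part of the original answer; it is a sketch that counts scalar multiplications for a schoolbook matrix product, writing r_a, r_b, r_c, c_c for ra, rb, rc, cc and assuming rc = ca-1 and cc = cb-1 hold:

```latex
\begin{aligned}
D_{\text{core}} &= A_1 \, C \, B_1^{\mathsf{T}}, \qquad
A_1 \in \mathbb{R}^{r_a \times r_c},\quad
C \in \mathbb{R}^{r_c \times c_c},\quad
B_1^{\mathsf{T}} \in \mathbb{R}^{c_c \times r_b} \\[4pt]
\operatorname{cost}\bigl((A_1 C)\, B_1^{\mathsf{T}}\bigr)
  &= r_a r_c c_c + r_a c_c r_b = r_a c_c \,(r_c + r_b) \\
\operatorname{cost}\bigl(A_1 \,(C B_1^{\mathsf{T}})\bigr)
  &= r_c c_c r_b + r_a r_c r_b = r_c r_b \,(c_c + r_a)
\end{aligned}
```

With the matC from the question (r_c = 178, c_c = 638) and assuming r_a = r_b = 10000, the first grouping needs about 6.5 x 10^10 multiplications and the second about 1.9 x 10^10. Both groupings give the same matrix, so when matC has fewer rows than columns it may be worth trying the second one. Actual run time also depends on memory access patterns and on the library's implementation, so this count is only a guide.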

Here is my attempt without using a for loop:

    let calculate (matA : matrix) (matB : matrix) (matC : matrix) =
        let ra = matA.NumRows
        let ca = matA.NumCols
        let rb = matB.NumRows
        let cb = matB.NumCols
        let matrixCalcul = Matrix.zero (ra+1) (rb+1)
        matrixCalcul.[1.., 0..0] <- matA.[0.., 0..0]
        matrixCalcul.[0..0, 1..] <- matB.[0.., 0..0].Transpose
        matrixCalcul.[1.., 1..] <- (matA.Columns(1, ca-1) * matC) * matB.Columns(1, cb-1).Transpose
        matrixCalcul

I tested with matA, matB and matC of sizes 200x279, 200x1279 and 278x1238 respectively. The two versions give the same result, and my function is 40x faster than the original. There are many reasons for this, but in general the vectorized version performs much better for matrix calculations.
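For readers who want to check the layout of the result without the PowerPack `matrix` type, here is a minimal sketch of the same computation on plain `float[,]` arrays. The helpers `multiply`, `transpose` and `calculateWithArrays` are hypothetical names introduced for illustration, and the naive triple loop is only meant to make the example self-contained, not to be fast.

```fsharp
// Schoolbook matrix product on plain 2D arrays (hypothetical helper, no PowerPack needed).
let multiply (x : float[,]) (y : float[,]) : float[,] =
    let n = Array2D.length1 x
    let k = Array2D.length2 x
    let m = Array2D.length2 y
    if Array2D.length1 y <> k then invalidArg "y" "inner dimensions must agree"
    let result = Array2D.create n m 0.0
    for i in 0 .. n - 1 do
        for p in 0 .. k - 1 do
            let xip = x.[i, p]
            for j in 0 .. m - 1 do
                result.[i, j] <- result.[i, j] + xip * y.[p, j]
    result

// Transpose of a 2D array (hypothetical helper).
let transpose (x : float[,]) : float[,] =
    Array2D.init (Array2D.length2 x) (Array2D.length1 x) (fun i j -> x.[j, i])

// Same layout as the answer's function: IDs in the first column and first row,
// and (A1 * C) * B1^T in the inner block.
let calculateWithArrays (matA : float[,]) (matB : float[,]) (matC : float[,]) : float[,] =
    let ra = Array2D.length1 matA
    let rb = Array2D.length1 matB
    let a1 = matA.[0.., 1..]               // matA without its ID column
    let b1t = transpose matB.[0.., 1..]    // matB without its ID column, transposed
    let core = multiply (multiply a1 matC) b1t
    let result = Array2D.create (ra + 1) (rb + 1) 0.0
    for i in 0 .. ra - 1 do
        result.[i + 1, 0] <- matA.[i, 0]   // IDs of A go down the first column
    for j in 0 .. rb - 1 do
        result.[0, j + 1] <- matB.[j, 0]   // IDs of B go along the first row
    Array2D.blit core 0 0 result 1 1 ra rb // copy the product into the inner block
    result

// Tiny worked example: two rows in A and B, two characteristics each.
let demoA = array2D [ [ 1.0; 1.0; 2.0 ]; [ 2.0; 3.0; 4.0 ] ]   // IDs 1 and 2
let demoB = array2D [ [ 10.0; 1.0; 0.0 ]; [ 20.0; 0.0; 1.0 ] ] // IDs 10 and 20
let demoC = array2D [ [ 1.0; 2.0 ]; [ 3.0; 4.0 ] ]
let demo = calculateWithArrays demoA demoB demoC
// demo has rows [0; 10; 20], [1; 7; 10] and [2; 15; 22]
printfn "%A" demo
```

In the example, B1 is the identity matrix, so the inner block is simply A1 * C, which makes the result easy to verify by hand.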


Source: https://habr.com/ru/post/1387074/

