Deploy nested loops in F #

I struggled with the following code. This is the implementation of the Forward-Euler F # algorithm used to model stars moving in a gravitational field.

let force (b1:Body) (b2:Body) = let r = (b2.Position - b1.Position) let rm = (float32)r.MagnitudeSquared + softeningLengthSquared if (b1 = b2) then VectorFloat.Zero else r * (b1.Mass * b2.Mass) / (Math.Sqrt((float)rm) * (float)rm) member this.Integrate(dT, (bodies:Body[])) = for i = 0 to bodies.Length - 1 do for j = (i + 1) to bodies.Length - 1 do let f = force bodies.[i] bodies.[j] bodies.[i].Acceleration <- bodies.[i].Acceleration + (f / bodies.[i].Mass) bodies.[j].Acceleration <- bodies.[j].Acceleration - (f / bodies.[j].Mass) bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT 

While this works, it is not entirely "functional." It also suffers from terrible performance, it is 2.5 times slower than the equivalent C # code. body - an array of structures of type Body.

What I'm struggling with is that force () is an expensive function, so you usually calculate it once for each pair and rely on Fij = -Fji. But it really confuses any deployment cycle, etc.

Suggestions are gratefully received! No, this is not homework ...

Thanks,

Ade

UPDATED: For clarification, Body and VectorFloat are defined as C # structures. This is because the program interacts between F # / C # and C ++ / CLI. In the end, I'm going to get the code on BitBucket, but this is a work in progress. I have some problems to sort out before I can express it.

 [StructLayout(LayoutKind.Sequential)] public struct Body { public VectorFloat Position; public float Size; public uint Color; public VectorFloat Velocity; public VectorFloat Acceleration; ''' } [StructLayout(LayoutKind.Sequential)] public partial struct VectorFloat { public System.Single X { get; set; } public System.Single Y { get; set; } public System.Single Z { get; set; } } 

A vector defines the type of operators that you expect from the standard Vector class. You can probably use the Vector3D class from the .NET platform for this case (I'm actually investigating it).

UPDATE 2: Improved code based on the first two answers below:

  for i = 0 to bodies.Length - 1 do for j = (i + 1) to bodies.Length - 1 do let r = ( bodies.[j].Position - bodies.[i].Position) let rm = (float32)r.MagnitudeSquared + softeningLengthSquared let f = r / (Math.Sqrt((float)rm) * (float)rm) bodies.[i].Acceleration <- bodies.[i].Acceleration + (f * bodies.[j].Mass) bodies.[j].Acceleration <- bodies.[j].Acceleration - (f * bodies.[i].Mass) bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT 
  • The difference in power function to cover the case b1 == b2 is the worst offender. You do not need this if softeningLength is always non-zero, even if it is very small (Epsilon). This optimization was in C # code, but not in the F # version (doh!).

  • Math.Pow (x, -1.5) seems to be much slower than 1 / (Math.Sqrt (x) * x). In fact, this algorithm is a bit strange, because its performance is dictated by the cost of this one step.

  • Moving the calculation of the force to the line and getting rid of some divisions also gives some improvement, but the performance was really killed by branching and the Sqrt value dominates in it.

WRT using classes over structures: There are cases (CUDA and C ++ native implementations of this code and DX9 rendering) where I need to get an array of bodies in unmanaged code or on a GPU. In these memcpy capable scripts, the contiguous block of memory seems to be what it wants. Not what I get from an array of the Body class.

+4
source share
2 answers

I am not sure if it is correct to rewrite this code in a functional style. I saw some attempts to write pairwise interaction calculations in functional order, and each of them was more difficult to follow than two nested loops.

Before looking at structures versus classes (I'm sure someone has something clever to say about this), maybe you can try to optimize the calculation itself?

You calculate two acceleration deltas, name them dAi and dAj:

dAi = r * m1 * m2 / (rm * sqrt (rm)) / m1

dAj = r * m1 * m2 / (rm * sqrt (rm)) / m2

[note: m1 = bodies. [i] .mass, m2 = bodies. [j] .mass]]

Mass division is canceled as follows:

dAi = rm2 / (rmsqrt (rm))

dAj = rm1 / (rmsqrt (rm))

Now you only need to calculate r / (rmsqrt (rm)) for each pair (i, j). This can be optimized because 1 / (rmsqrt (rm)) = 1 / (rm ^ 1.5) = rm ^ -1.5, so if you made r '= r * (rm ** -1.5), then Edit: no it not not that premature optimization says right there (see comment). The calculation r '= 1.0 / (r * sqrt r) is the fastest.

dAi = m2 * r '

dAj = m1 * r '

Then your code will become something like

 member this.Integrate(dT, (bodies:Body[])) = for i = 0 to bodies.Length - 1 do for j = (i + 1) to bodies.Length - 1 do let r = (b2.Position - b1.Position) let rm = (float32)r.MagnitudeSquared + softeningLengthSquared let r' = r * (rm ** -1.5) bodies.[i].Acceleration <- bodies.[i].Acceleration + r' * bodies.[j].Mass bodies.[j].Acceleration <- bodies.[j].Acceleration - r' * bodies.[i].Mass bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT 

Look, ma, no more divisions!

Warning: unverified code. Try at your own risk.

+3
source

I would like to play arround with your code, but this is difficult, as the definition of Body and FloatVector is missing, and they also seem to be missing from the original blog you are pointing to.

I would venture to suggest that you can improve your performance and rewrite it in a more functional style using F # lazy calculations: http://msdn.microsoft.com/en-us/library/dd233247(VS.100).aspx

The idea is quite simple: you transfer any costly calculation that can be repeatedly calculated in a lazy (...) expression, then you can force the calculation as many times as you want, and it will ever be calculated only once.

+1
source

Source: https://habr.com/ru/post/1299864/


All Articles