I struggled with the following code. This is the implementation of the Forward-Euler F # algorithm used to model stars moving in a gravitational field.
let force (b1:Body) (b2:Body) = let r = (b2.Position - b1.Position) let rm = (float32)r.MagnitudeSquared + softeningLengthSquared if (b1 = b2) then VectorFloat.Zero else r * (b1.Mass * b2.Mass) / (Math.Sqrt((float)rm) * (float)rm) member this.Integrate(dT, (bodies:Body[])) = for i = 0 to bodies.Length - 1 do for j = (i + 1) to bodies.Length - 1 do let f = force bodies.[i] bodies.[j] bodies.[i].Acceleration <- bodies.[i].Acceleration + (f / bodies.[i].Mass) bodies.[j].Acceleration <- bodies.[j].Acceleration - (f / bodies.[j].Mass) bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT
While this works, it is not entirely "functional." It also suffers from terrible performance, it is 2.5 times slower than the equivalent C # code. body - an array of structures of type Body.
What I'm struggling with is that force () is an expensive function, so you usually calculate it once for each pair and rely on Fij = -Fji. But it really confuses any deployment cycle, etc.
Suggestions are gratefully received! No, this is not homework ...
Thanks,
Ade
UPDATED: For clarification, Body and VectorFloat are defined as C # structures. This is because the program interacts between F # / C # and C ++ / CLI. In the end, I'm going to get the code on BitBucket, but this is a work in progress. I have some problems to sort out before I can express it.
[StructLayout(LayoutKind.Sequential)] public struct Body { public VectorFloat Position; public float Size; public uint Color; public VectorFloat Velocity; public VectorFloat Acceleration; ''' } [StructLayout(LayoutKind.Sequential)] public partial struct VectorFloat { public System.Single X { get; set; } public System.Single Y { get; set; } public System.Single Z { get; set; } }
A vector defines the type of operators that you expect from the standard Vector class. You can probably use the Vector3D class from the .NET platform for this case (I'm actually investigating it).
UPDATE 2: Improved code based on the first two answers below:
for i = 0 to bodies.Length - 1 do for j = (i + 1) to bodies.Length - 1 do let r = ( bodies.[j].Position - bodies.[i].Position) let rm = (float32)r.MagnitudeSquared + softeningLengthSquared let f = r / (Math.Sqrt((float)rm) * (float)rm) bodies.[i].Acceleration <- bodies.[i].Acceleration + (f * bodies.[j].Mass) bodies.[j].Acceleration <- bodies.[j].Acceleration - (f * bodies.[i].Mass) bodies.[i].Position <- bodies.[i].Position + bodies.[i].Velocity * dT bodies.[i].Velocity <- bodies.[i].Velocity + bodies.[i].Acceleration * dT
The difference in power function to cover the case b1 == b2 is the worst offender. You do not need this if softeningLength is always non-zero, even if it is very small (Epsilon). This optimization was in C # code, but not in the F # version (doh!).
Math.Pow (x, -1.5) seems to be much slower than 1 / (Math.Sqrt (x) * x). In fact, this algorithm is a bit strange, because its performance is dictated by the cost of this one step.
Moving the calculation of the force to the line and getting rid of some divisions also gives some improvement, but the performance was really killed by branching and the Sqrt value dominates in it.
WRT using classes over structures: There are cases (CUDA and C ++ native implementations of this code and DX9 rendering) where I need to get an array of bodies in unmanaged code or on a GPU. In these memcpy capable scripts, the contiguous block of memory seems to be what it wants. Not what I get from an array of the Body class.