F # is much slower than Ocaml to handle complex keys like int * int * int in data structures

Question

F # is much slower than Ocaml to handle complex keys like int * int * int in data structures

I converted the Ocaml program to F #, and the overall performance is the same as Ocaml.

However, to get to this point, I tried to replace the exceptions with Option values.

The program works with a list, maps, etc., which has int*int*int (= tuples of three ints ).

I have a performance leak I don't understand. Can anyone explain how I can have 90% of my total runtime inside a function called

 System.Tuple`3.System.Collections.IStructuralEquatable.Equals( object, class System.Collections.IEqualityComparer)

and what can i do about it?

+4

performance profiling f # ocaml

mattias May 15, '13 at 5:49

source share

1 answer

Paul westcott · Answer 1 · 2013-06-12T05:22:37+0000

Well, as people noted, it’s rather difficult to diagnose a problem without code, but I will do my best :-)

Your question made me run a test that I had been planning for some time to test the performance of reference types and value types as keys of associative containers. My hypothesis, based on vague feelings about .net runtime, was that for small key sizes (3 types) that the value type would win. I seem to be wrong ([edit] in fact, further testing turned out to be correct! [/ Edit])

Look at some code (as usual, a microefficiency test, so take it with salt):

I used 5 different containers of different tastes for storing ints (F # is good enough to create equalities and comparisons for structure types as well as records):

 type KeyStruct(_1':int, _2':int, _3':int) = struct member this._1 = _1' member this._2 = _2' member this._3 = _3' end type KeyGenericStruct<'a>(_1':'a, _2':'a, _3':'a) = struct member this._1 = _1' member this._2 = _2' member this._3 = _3' end type KeyRecord = { _1 : int; _2 : int; _3 : int } type KeyGenericRecord<'a> = { _1 : 'a; _2 : 'a; _3 : 'a }

plus i used the original tuple (int * int * int)

Then I created the following test harness:

 let inline RunTest<'a when 'a : equality> iterationCount createAssociativeMap (createKey:_->_->_->'a) = System.GC.Collect () System.GC.WaitForFullGCComplete () |> ignore let data = [| for a in 0..99 do for b in 0..99 do for c in 0..99 do yield a,b,c |] // shuffle let r = System.Random (0) for i = 0 to data.Length-1 do let j = r.Next (i, data.Length) let t = data.[i] data.[i] <- data.[j] data.[j] <- t let keyValues = data |> Array.mapi (fun ik -> k, 0.5-(float i)/(float data.Length)) |> Array.toSeq let sw = System.Diagnostics.Stopwatch.StartNew () let mapper = createAssociativeMap createKey keyValues let creationTime = sw.ElapsedMilliseconds let sw = System.Diagnostics.Stopwatch.StartNew () let mutable checksum = 0. for i = 0 to iterationCount do let a, b, c = r.Next 100, r.Next 100, r.Next 100 let key = createKey abc checksum <- checksum + (mapper key) let accessTime= sw.ElapsedMilliseconds printfn "checksum %f elapsed %d/%d (%s)" checksum creationTime accessTime (typeof<'a>.Name) let RunNTrials<'a when 'a : equality> = RunTest<'a> 1000000

and then run it using some types of associative containers:

 let createDictionary create keyValues = let d = System.Collections.Generic.Dictionary<_,_> () keyValues |> Seq.map (fun ((_1,_2,_3),value) -> create _1 _2 _3, value) |> Seq.iter (fun (key,value) -> d.[key] <- value) (fun key -> d.[key]) let createDict create keyValues = let d = keyValues |> Seq.map (fun ((_1,_2,_3),value) -> create _1 _2 _3, value) |> dict (fun key -> d.[key]) let createMap create keyValues = let d = keyValues |> Seq.map (fun ((_1,_2,_3),value) -> create _1 _2 _3, value) |> Map.ofSeq (fun key -> d.[key]) let createCustomArray create keyValues = let maxA = 1 + (keyValues |> Seq.map (fun ((a,_,_),_) -> a) |> Seq.max) let maxB = 1 + (keyValues |> Seq.map (fun ((_,b,_),_) -> b) |> Seq.max) let maxC = 1 + (keyValues |> Seq.map (fun ((_,_,c),_) -> c) |> Seq.max) let createIndex abc = a * maxB * maxC + b * maxC + c let values : array<float> = Array.create (maxA * maxB * maxC) 0. keyValues |> Seq.iter (fun ((a,b,c),d) -> values.[createIndex abc] <- d) (fun (a,b,c) -> values.[a * maxB * maxC + b * maxC + c]) let RunDictionary<'a when 'a : equality> = RunNTrials<'a> createDictionary let RunDict<'a when 'a : equality> = RunNTrials<'a> createDict let RunMap<'a when 'a : comparison> = RunNTrials<'a> createMap let RunCustomArray = RunNTrials<_> createCustomArray

And performed the tests as follows:

 printfn "Using .net System.Collections.Generic.Dictionary" RunDictionary (fun abc -> { KeyRecord._1=a; _2=b; _3=c }) RunDictionary (fun abc -> { KeyGenericRecord._1=a; _2=b; _3=c }) RunDictionary (fun abc -> KeyStruct(a, b, c)) RunDictionary (fun abc -> KeyGenericStruct(a, b, c)) RunDictionary (fun abc -> (a, b, c)) printfn "Using f# 'dict'" RunDict (fun abc -> { KeyRecord._1=a; _2=b; _3=c }) RunDict (fun abc -> { KeyGenericRecord._1=a; _2=b; _3=c }) RunDict (fun abc -> KeyStruct(a, b, c)) RunDict (fun abc -> KeyGenericStruct(a, b, c)) RunDict (fun abc -> (a, b, c)) printfn "Using f# 'Map'" RunMap (fun abc -> { KeyRecord._1=a; _2=b; _3=c }) RunMap (fun abc -> { KeyGenericRecord._1=a; _2=b; _3=c }) RunMap (fun abc -> KeyStruct(a, b, c)) RunMap (fun abc -> KeyGenericStruct(a, b, c)) RunMap (fun abc -> (a, b, c)) printfn "Using custom array" RunCustomArray (fun abc -> (a, b, c))

And I got the following results (the checksum is just an attempt to make sure I'm not doing anything stupid), "past n / m" passed "container creation time" / {container access time}:

 Using .net System.Collections.Generic.Dictionary checksum -55.339450 elapsed 874/562 (KeyRecord) checksum -55.339450 elapsed 1251/898 (KeyGenericRecord`1) checksum -55.339450 elapsed 569/1024 (KeyStruct) checksum -55.339450 elapsed 740/1427 (KeyGenericStruct`1) checksum -55.339450 elapsed 2497/2218 (Tuple`3) Using f# 'dict' checksum -55.339450 elapsed 979/628 (KeyRecord) checksum -55.339450 elapsed 1614/1206 (KeyGenericRecord`1) checksum -55.339450 elapsed 3237/5625 (KeyStruct) checksum -55.339450 elapsed 3290/5626 (KeyGenericStruct`1) checksum -55.339450 elapsed 2448/1914 (Tuple`3) Using f# 'Map' checksum -55.339450 elapsed 8453/2638 (KeyRecord) checksum -55.339450 elapsed 31301/25441 (KeyGenericRecord`1) checksum -55.339450 elapsed 30956/26931 (KeyStruct) checksum -55.339450 elapsed 53699/49274 (KeyGenericStruct`1) checksum -55.339450 elapsed 32203/25274 (Tuple`3) Using custom array checksum -55.339450 elapsed 484/160 (Tuple`3)

Several runs showed a slight change in the number, but the main trends continued. Thus, we mainly test whether boxing occurs (on the map and dict), and then the implementation of GetHashCode () (the general version, if slower than the fully typed version, and the worst of Tuple), and in the case of Map CompareTo.

Now, when will this leave your question? Well, if all the time will be spent on Equals of the Tuple, perhaps this will help change the type of record.

But probably not :-) [because really, if it was a hash container, then GetHashCode should not cause a lot of collisions, and it was a map, then that would be CompareTo)

Anyway, in real code you will obviously have different garbage collector attacks, etc. etc., as I said, take him for what he ...

I conducted some additional testing, completing the launch of each test several times in Tasks and each separately and in parallel in parallel (starting more tasks than I have kernels), and then taking the average time to complete.

The reason for this was to take into account the collection time of the garbage collector.

When I did this, the original hypothesis of a non-shared key structure really won.

If someone is really interested and does not want to make changes, I can post the code, but in fact I think I'm the only one who reads this, so these are just my personal notes :-)

F # is much slower than Ocaml to handle complex keys like int * int * int in data structures

More articles: