Unsafe casting in F# with zero-copy semantics

I am trying to achieve a static cast, akin to coercion, that does not result in copying any data. A naive static cast does not work:

let pkt = byte_buffer :> PktHeader 

FS0193: Type constraint mismatch. The type byte [] is not compatible with type PktHeader. The type 'byte []' is not compatible with the type 'PktHeader' (FS0193) (program)

where the packet is initially held in a byte array because of how System.Net.Sockets.Socket.Receive() is defined. The low-level packet structure is defined like this:

    [<Struct; StructLayout(LayoutKind.Explicit)>]
    type PktHeader =
        [<FieldOffset(0)>] val mutable field1: uint16
        [<FieldOffset(2)>] val mutable field2: uint16
        [<FieldOffset(4)>] val mutable field3: uint32
        .... many more fields follow ....

Efficiency matters in this real-world scenario, because wasteful copying of data could rule out F# as an implementation language. How can zero-copy efficiency be achieved here?

EDIT Nov 29: My question rested on the implicit belief that an unsafe static cast in the style of C / C++ / C# is a useful construct, as if that were self-evident. On second thought, however, this kind of cast is not idiomatic in F#, since it is inherently an imperative-language technique fraught with peril. For this reason I have accepted the answer by V.B., where SBE / FlatBuffers data access is put forward as best practice.

+5
2 answers

F# and very low-level performance optimization are not best friends, but then again, some smart people work magic even with Java, which has neither value types nor real generic collections for them.

1) Lately I am a big fan of the flyweight pattern. If your architecture allows it, you can wrap the byte array and access the members of the structure via offsets. C# example here. SBE / FlatBuffers even have tools to generate the wrappers automatically from a definition.

2) If you can stay in an unsafe context in C# to do the work, pointer casting is very simple and efficient. However, this requires pinning the byte array and keeping its handle for later release, or staying within the scope of the fixed keyword. If you have many small arrays and no pool, you may run into problems with the GC.

3) The third option is to abuse the .NET type system and cast the byte array with IL like this (this can be coded in F# if you insist :)):

    static T UnsafeCast(object value) {
        ldarg.1 //load type object
        ret     //return type T
    }

I tried this option and even have a snippet somewhere if you need it, but I am uneasy about this approach because I do not understand its consequences for the GC. We have two objects backed by the same memory; what happens when one of them is collected? I was going to ask a new question about this detail and will post it soon.


The latter approach may be useful for arrays of structures, but for a single structure it will box it or copy it anyway. Since structures live on the stack and are passed by value, you will probably get better results by simply casting a pointer to the byte[] in unsafe C#, or by using Marshal.PtrToStructure as in the other answer here, and then copying by value. Copying is not the worst thing, especially on the stack; allocating new objects and the resulting GC work are the enemy. So you need to pool the byte arrays, and that will contribute far more to overall performance than the struct-casting issue.

But if your structure is very large, option 1 could still be better.
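Option 1 above can be sketched in F# roughly as follows. This is only an illustration under stated assumptions: the type name PktHeaderView and the sample bytes are hypothetical, the offsets mirror the three fields shown in the question, and BitConverter reads in the host byte order (a real protocol may need explicit byte-order handling).

```fsharp
open System

// Hypothetical flyweight view (not from the original post): it holds only a
// reference to the receive buffer and a start offset, so no packet bytes
// are copied and, being a struct, the view itself is not heap-allocated.
[<Struct>]
type PktHeaderView(buffer: byte[], start: int) =
    // Offsets mirror the FieldOffset values of PktHeader in the question.
    // BitConverter uses the host byte order (an assumption of this sketch).
    member this.Field1 = BitConverter.ToUInt16(buffer, start + 0)
    member this.Field2 = BitConverter.ToUInt16(buffer, start + 2)
    member this.Field3 = BitConverter.ToUInt32(buffer, start + 4)

// Sample bytes standing in for data written by Socket.Receive.
let buffer = [| 0x01uy; 0x00uy; 0x02uy; 0x00uy; 0x03uy; 0x00uy; 0x00uy; 0x00uy |]
let view = PktHeaderView(buffer, 0)
printfn "%d %d %d" view.Field1 view.Field2 view.Field3
```

Each property decodes its field on access, so the cost is paid only for the fields that are actually read, which is why this can win for very large headers.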

0

A pure F# approach to the conversion:

    open System.Runtime.InteropServices

    let convertByteArrayToStruct<'a when 'a : struct> (byteArr : byte[]) =
        // Pin the array so the GC cannot move it while we read from it.
        let handle = GCHandle.Alloc(byteArr, GCHandleType.Pinned)
        let structure = Marshal.PtrToStructure (handle.AddrOfPinnedObject(), typeof<'a>)
        handle.Free()
        structure :?> 'a

This is a minimal example, but I would recommend adding some checks on the length of the byte array, because, as written, it will produce undefined results if you give it a byte array that is too short. You can check against Marshal.SizeOf(typeof<'a>).
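One way the recommended length check might look is sketched below. The names tryConvertByteArrayToStruct and DemoHeader are hypothetical and used only for illustration; DemoHeader stands in for the first three fields of the question's PktHeader, whose full definition is not shown. The sketch also releases the handle in a finally block so it is freed even if the marshalling call throws.

```fsharp
open System
open System.Runtime.InteropServices

// Hypothetical variant of convertByteArrayToStruct that returns None
// instead of reading past the end of a buffer that is too short.
let tryConvertByteArrayToStruct<'a when 'a : struct> (byteArr: byte[]) : 'a option =
    if isNull byteArr || byteArr.Length < Marshal.SizeOf(typeof<'a>) then
        None
    else
        let handle = GCHandle.Alloc(byteArr, GCHandleType.Pinned)
        try
            Some (Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof<'a>) :?> 'a)
        finally
            handle.Free()   // always unpin, even on failure

// Stand-in for the first fields of PktHeader (an assumption of this sketch).
[<Struct; StructLayout(LayoutKind.Explicit)>]
type DemoHeader =
    [<FieldOffset(0)>] val mutable field1: uint16
    [<FieldOffset(2)>] val mutable field2: uint16
    [<FieldOffset(4)>] val mutable field3: uint32

// 4 bytes is shorter than the 8-byte header; 8 bytes is exactly enough.
let tooShort : DemoHeader option = tryConvertByteArrayToStruct (Array.zeroCreate 4)
let justEnough : DemoHeader option = tryConvertByteArrayToStruct (Array.zeroCreate 8)
```

Returning an option pushes the "buffer too short" case into the type, so callers have to handle it rather than discover it at runtime.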


There is no pure F# solution for a less safe conversion than this (and this approach is already prone to failing at runtime). Alternatives may include interop with C# to use unsafe and fixed for the conversion.

Ultimately, you are asking to subvert the F# type system, which is not really what the language is designed for. One of the main advantages of F# is the strength of its type system and its ability to help you write statically verified code.

+2

Source: https://habr.com/ru/post/1236910/

