Writing a Storable vector instance for CString with an O(1) function to get the total byte length

I am trying to write a Storable vector instance for CString (null-terminated C chars in my case). The Storable instance will store the pointers that a CString is (Ptr CChar), so the length of the vector is the number of CString pointers. The reason I am writing this Storable instance is that it will be used for zero-copy access to CStrings coming from the FFI, and then for building a ByteString quickly using unsafeCreate (after some transformation, which is why fast vectors are used for the intermediate operations). Building the ByteString quickly requires three things from the Storable instance:

  • Total length in bytes: the Storable instance must carry bookkeeping overhead to store the length of each CString as it is added to the vector, plus the total length of the CStrings stored so far. Assume the total length of the C strings cannot exceed 2^31, so Int32/Word32 is enough to hold both the length of each CString and the total length.
  • A function that stores a CString and its length, O(n): this function walks the CString to measure its length, stores that length, and increments the total length by it.
  • A function that returns the total length in bytes, O(1): this function simply reads the value from the field that stores the total length.
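A minimal sketch of the per-element half of this bookkeeping, using only `base`. The names `Measured`, `mLen`, `mPtr`, `ptrSize`, `emptyMeasured` and `roundTrip` are hypothetical, and the memory layout (pointer first, then the length, padded to two pointer widths) is an assumption rather than a required one. A Storable instance only describes a single element, so the running total would most likely live in a wrapper around the vector rather than in the instance itself.

```haskell
import Data.Word (Word32)
import Foreign.C.String (CString)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (nullPtr)
import Foreign.Storable (Storable (..))

-- Hypothetical element type: a C string pointer paired with its cached length.
data Measured = Measured
  { mLen :: !Word32   -- length in bytes, measured once when the element is stored
  , mPtr :: !CString  -- the pointer itself; the string data is never copied
  } deriving (Eq, Show)

-- Size of a pointer on the current platform.
ptrSize :: Int
ptrSize = sizeOf (undefined :: CString)

-- Assumed layout: pointer at offset 0, length right after it, and the whole
-- element padded to two pointer widths so array elements stay pointer-aligned.
instance Storable Measured where
  alignment _ = alignment (undefined :: CString)
  sizeOf _    = 2 * ptrSize
  peek p = do
    ptr <- peekByteOff p 0
    len <- peekByteOff p ptrSize
    return (Measured len ptr)
  poke p (Measured len ptr) = do
    pokeByteOff p 0 ptr
    pokeByteOff p ptrSize len

-- An element that points at nothing and has length zero.
emptyMeasured :: Measured
emptyMeasured = Measured 0 nullPtr

-- Write an element to temporary memory and read it back.
roundTrip :: Measured -> IO Measured
roundTrip m = alloca $ \p -> poke p m >> peek p
```

With an instance like this, the O(n) measuring step happens once per string when a `Measured` value is built, and any later pass over the vector reads `mLen` instead of walking the string again.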

While I know how to write my own Storable instance, I don't know how to handle this kind of case. Simple code (even a toy example) that shows how to do the custom bookkeeping, and how to write the functions that store and retrieve the bookkeeping results, would be much appreciated.

Update 1 (clarification)

The reason for using a Storable vector instance in my case is twofold: fast computation/transformation using unboxed types (real-time data received via the C FFI), and fast conversion to a ByteString (for sending the real-time data over IPC to another program). For fast conversion to a ByteString, unsafeCreate is excellent, but we need to know how many bytes to allocate and also pass it a transformation function. Given a Storable vector instance (with mixed types; I simplified my question above to the CString type), it is easy for me to write a fast transformation function that walks every element of the vector and converts it to bytes, and then pass that function to unsafeCreate. But we also need to pass it the number of bytes to allocate. An O(n) byte-length function is too slow and could double the overhead of constructing the ByteString.
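To make the role of the precomputed total concrete, here is a minimal sketch of the unsafeCreate step, assuming the lengths are already known. `packMeasured` and `demo` are hypothetical names; `unsafeCreate` comes from `Data.ByteString.Internal` and `copyBytes` from `Foreign.Marshal.Utils`. The sketch uses plain `Int` lengths for brevity, so Word32 lengths would need a `fromIntegral`.

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Internal (unsafeCreate)
import Data.Word (Word8)
import Foreign.C.String (CString, withCString)
import Foreign.Marshal.Utils (copyBytes)
import Foreign.Ptr (Ptr, castPtr, plusPtr)

-- Hypothetical helper: concatenate already-measured C strings into one
-- ByteString. `total` must equal the sum of the per-string lengths;
-- nothing here re-measures the strings, and a wrong total corrupts memory.
packMeasured :: Int -> [(Int, CString)] -> B.ByteString
packMeasured total pairs = unsafeCreate total (go pairs)
  where
    go :: [(Int, CString)] -> Ptr Word8 -> IO ()
    go [] _ = return ()
    go ((n, src) : rest) dst = do
      copyBytes dst (castPtr src) n   -- copy n bytes, excluding the NUL
      go rest (dst `plusPtr` n)       -- advance the write position

-- Usage: two strings of 3 and 4 bytes, so 7 bytes are allocated up front.
demo :: IO B.ByteString
demo =
  withCString "foo" $ \a ->
    withCString "quux" $ \b ->
      -- Force the result while the source pointers are still valid.
      return $! packMeasured 7 [(3, a), (4, b)]
```

Because unsafeCreate runs its filling action lazily, the resulting ByteString has to be forced before the source pointers are freed, which is why `demo` uses `return $!`.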

1 answer

It sounds like you want to write something like this. Note that this code is untested.

    import Data.Word (Word32)
    import Foreign.C.Types (CChar, CSize (..))
    import Foreign.Ptr (Ptr)

    -- strlen from the C standard library. This is the O(n) walk.
    foreign import ccall unsafe "string.h strlen"
      c_strlen :: Ptr CChar -> IO CSize

    -- The basic type. Export the type but not the constructor or
    -- accessors from the module.
    data StringVector = StringVector
      { strVecLength   :: !Word32                 -- Total length
      , strVecContents :: [(Word32, Ptr CChar)]   -- (Length, value) pairs
      }

    -- Invariants: forall (StringVector len contents),
    --   len == sum (map fst contents)
    --   all (\p -> fst p == c_strlen (snd p)) contents

    -- The empty case.
    emptyStrVec :: StringVector
    emptyStrVec = StringVector 0 []

    -- Put a new CString at the head of the vector. Analogous to ":".
    -- Lives in IO because measuring the string reads C memory.
    stringVectorCons :: Ptr CChar -> StringVector -> IO StringVector
    stringVectorCons ptr (StringVector len pairs) = do
      n <- fromIntegral <$> c_strlen ptr
      return $ StringVector (len + n) ((n, ptr) : pairs)

    -- Extract the head of the vector and the remaining vector,
    -- or Nothing if the vector is empty.
    stringVectorUncons :: StringVector -> Maybe ((Word32, Ptr CChar), StringVector)
    stringVectorUncons (StringVector _   [])      = Nothing
    stringVectorUncons (StringVector len (h : t)) =
      Just (h, StringVector (len - fst h) t)

After that, you can add whatever other functions your application requires. Just make sure that every function preserves the invariants.
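As one illustration of an additional function that keeps the total consistent, here is a self-contained sketch of appending two vectors. `stringVectorAppend`, `totalIsConsistent` and `example` are hypothetical names, not part of the answer above, and the type is repeated so the block stands alone. The example fills in the lengths by hand instead of calling strlen.

```haskell
import Data.Word (Word32)
import Foreign.C.String (CString, withCString)

-- Same shape as the type above, repeated so this block is self-contained.
data StringVector = StringVector
  { strVecLength   :: !Word32               -- Total length
  , strVecContents :: [(Word32, CString)]   -- (Length, value) pairs
  }

-- Hypothetical: append two vectors. The new total is the sum of the two
-- stored totals, so no string has to be measured again.
stringVectorAppend :: StringVector -> StringVector -> StringVector
stringVectorAppend (StringVector l1 c1) (StringVector l2 c2) =
  StringVector (l1 + l2) (c1 ++ c2)

-- Debugging aid for the first invariant: the stored total must equal the
-- sum of the stored per-string lengths. This check is O(n).
totalIsConsistent :: StringVector -> Bool
totalIsConsistent (StringVector len contents) =
  len == sum (map fst contents)

-- Usage: "foo" is 3 bytes and "quux" is 4 bytes.
example :: IO (Word32, Bool)
example =
  withCString "foo" $ \a ->
    withCString "quux" $ \b -> do
      let v = stringVectorAppend (StringVector 3 [(3, a)])
                                 (StringVector 4 [(4, b)])
      return (strVecLength v, totalIsConsistent v)
```

The second invariant (each stored length matches the real string length) cannot be checked purely, which is another reason to keep the constructor private and build values only through functions that measure the string themselves.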


Source: https://habr.com/ru/post/1385754/

