F# Collection Type for Mixed Types

Question

This question comes from someone who is transitioning from R to F#. I fully admit that my approach here may be wrong, so I'm looking for the F# way of doing this. I have a situation where I want to iterate over a set of XML files, parse them, and extract a few values to determine which ones need further processing. My natural inclination is to map over the array of XML data, exampleData in this case, parse each element with the RawDataProvider type provider, and finally create a Map object for each file containing the parsed XML, the Status value from the XML, and the ItemId value.

It turns out that the Map type in F# is not like a list in R. Lists in R are essentially hash maps that can hold mixed types. The Map type in F# does not seem to support storing mixed types. I have found mixed-type lists incredibly useful in my R work, and I am looking for the F# equivalent.

Or am I thinking about this all wrong? This is a very natural way of processing data in R, so I would expect there to be a way to do it in F# as well. I intend to do further analysis and add more data elements to these collections.

Update: This seems like such a simple case that there should be an idiomatic way to do it in F# without having to define a record type for each step of the analysis. I have updated my example to further illustrate what I am trying to do. I want to return an array of the Map objects that I have analyzed:

    type RawDataProvider = XmlProvider<"""<product Status="Good" ItemId="123" />""">

    let exampleData = [| """<product Status="Good" ItemId="123" />""";
                         """<product Status="Bad" ItemId="456" />""";
                         """<product Status="Good" ItemId="789" />""" |]

    let dataResult =
        exampleData
        |> Array.map (fun fileData -> RawDataProvider.Parse(fileData))
        |> Array.map (fun xml -> Map.empty.Add("xml", xml).Add("Status", xml.Status).Add("ItemId", xml.ItemId))
        |> Array.map (fun elem -> elem.["calc1Value"] = calc1 elem["itemId"])
        |> Array.map (fun elem -> elem.["calc2"] = calc2 elem.["ItemId"] elem.["calc1Value"])

dictionary f# f#-data

Matthew Crews Mar 08 '16 at 18:53

1 answer

scrwtp · Answer 1 · 2016-03-09T00:32:48+0000

Here's what I would consider almost idiomatic here. I've kept the same shape as in your example so that you can match the two up:

    let dataResult =
        exampleData
        |> Array.map (fun fileData -> RawDataProvider.Parse(fileData))
        |> Array.map (fun xml -> xml, calc1 xml.ItemId)
        |> Array.map (fun (xml, calcedValue1) -> xml, calcedValue1, calc2 xml.ItemId calcedValue1)

What XmlProvider really gives you is not just XML parsing, but a strongly typed representation of the XML. This is better than putting the data into a map, because it gives you stronger guarantees about the correctness of your program. For example, it won't let you mix up itemId and ItemId, as happened in your code snippet ;)
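To make the difference concrete, here is a minimal sketch. It assumes a hand-written Product record as a stand-in for the type that XmlProvider would generate, so it runs without FSharp.Data; the record and the value names are illustrative only. A mixed-type map is possible if every value is boxed to obj, but a misspelled key then only shows up at run time:

```fsharp
// Hypothetical stand-in for the provider-generated type (an assumption,
// so that the sketch does not depend on FSharp.Data).
type Product = { Status: string; ItemId: int }

let product = { Status = "Good"; ItemId = 123 }

// A map can hold mixed types only if every value is boxed to obj.
let asMap : Map<string, obj> =
    Map.ofList [ ("Status", box product.Status); ("ItemId", box product.ItemId) ]

// The misspelled key compiles without complaint and is simply missing at run time.
let lookupWithTypo = Map.tryFind "itemId" asMap

// Reading a value back needs an explicit unbox, which can also fail at run time.
let fromMap = unbox<int> (asMap.["ItemId"])

// With the typed value, writing product.itemId would not compile at all.
let typedItemId : int = product.ItemId
```

With the typed value the compiler checks both the field name and its type, which is the guarantee the map version gives up.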

For the values that you calculate in the later steps, you can use tuples instead of records. In general, records are preferable to tuples, since they lead to more readable code, but combining related values of different types into ad-hoc aggregates is exactly the scenario that tuples are meant for.
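A small sketch of that trade-off, again assuming a hypothetical Product record in place of the provider-generated type, and made-up calculated values. The tuple needs no declaration; the record costs one type definition but names each value:

```fsharp
// Hypothetical stand-in for the parsed XML type (an assumption).
type Product = { Status: string; ItemId: int }

// Hypothetical record for an analysis step; field names are illustrative.
type Analyzed = { Source: Product; Calc1: int; Calc2: float }

let product = { Status = "Good"; ItemId = 123 }

// Ad-hoc aggregate: a tuple of mixed types, no type declaration needed.
let asTuple = product, 246, 184.5

// Tuples are taken apart by position.
let (source, calc1Value, calc2Value) = asTuple

// The record version carries the same data, but every value has a name.
let asRecord = { Source = source; Calc1 = calc1Value; Calc2 = calc2Value }
```

For a short pipeline where the intermediate values are consumed right away, the tuple is usually enough; once the aggregate is passed around or grows past a few elements, the record tends to read better.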

Now, I said almost idiomatic: I would split the parsing and the processing of the parsed XML into separate functions, and calculate both the calc1 and calc2 results in a single function instead of composing two Array.maps, like this:

    let dataResult =
        parsedData
        |> Array.map (fun xml ->
            let calced1 = calc1 xml.ItemId
            xml, calced1, calc2 xml.ItemId calced1)
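Put together, the split into a parsing function and a processing function could look like the sketch below. It is self-contained under several assumptions: it parses with System.Xml.Linq and a hand-written Product record instead of XmlProvider, and calc1 and calc2 are placeholder implementations, since the question does not define them.

```fsharp
open System.Xml.Linq

// Hypothetical stand-in for the provider-generated type (an assumption).
type Product = { Status: string; ItemId: int }

let exampleData =
    [| """<product Status="Good" ItemId="123" />"""
       """<product Status="Bad" ItemId="456" />"""
       """<product Status="Good" ItemId="789" />""" |]

// Parsing lives in its own function. XElement is used here only to keep
// the sketch free of external packages.
let parseProduct (text: string) : Product =
    let element = XElement.Parse text
    { Status = element.Attribute(XName.Get "Status").Value
      ItemId = int (element.Attribute(XName.Get "ItemId").Value) }

// Placeholder calculations: the real calc1 and calc2 are not given in the question.
let calc1 (itemId: int) : int = itemId * 2
let calc2 (itemId: int) (calc1Value: int) : float = float (itemId + calc1Value) / 2.0

// Processing of one parsed item: both calculations in a single function.
let processProduct (product: Product) =
    let calced1 = calc1 product.ItemId
    product, calced1, calc2 product.ItemId calced1

let dataResult =
    exampleData
    |> Array.map parseProduct
    |> Array.map processProduct
```

Each element of dataResult is a tuple of the parsed product, an int, and a float, so the mixed types are kept without any boxing.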

If you come from an R background, you might want to check out Deedle for an alternative approach. It gives you a workflow similar to R in F#.
