Accessing entries in a CSV file for calculations in F#

How can I access the entries in a CSV file to perform calculations on them in F#?

I can read the CSV file into memory in the usual way, but after that I'm stuck.

Ideally, I would just create arrays from the columns and then use Array.map2 to do the calculations.

So, if column 1 is a measure of website usage (for example, 6 visits to the website) and column 2 is the number of users who reached the value in column 1, we could calculate the average number of visits by multiplying each entry in the column 1 array by the corresponding entry in the column 2 array, summing the products, and dividing by the sum of the column 2 array.

I tried the CSV-to-array code on F# Snippets, http://fssnip.net/3T , but it gives me an array that is a series of string tuples.

Can anyone suggest a better approach?

EDIT: A sample input would be:

    Visits  Count
    1       8
    2       9
    3       5
    4       3
    5       2
    6       1
    7       1
    10      1

And the output would need to be the mean of the data, in this case 2.87 (to 2 decimal places).

EDIT 2: The current output of the CSV-to-array code I found is

  val it : seq<BookWindow> = seq [{Visits = 1; Count = 8;}; {Visits = 2; Count = 9;}; {Visits = 3; Count = 5;}; {Visits = 4; Count = 3;}; ...] 

which is not so useful for calculations ...

3 answers

I create a record type so I can use strongly typed operations later, and then read the text file straight into a seq<myRecord> , as in the code below. If I intend to reuse this one-liner, I usually move it onto the record as a static member fromFile . Seq is very useful when you work with large text files, because reading lines lazily this way uses very little memory.

Edit: made this cleaner:

    open System.IO

    type myRecord =
        { Visits: int
          Count: int }
        with
        static member fromFile file =
            file
            |> File.ReadLines                    // expose as seq<string>
            |> Seq.skip 1                        // skip headers
            |> Seq.map (fun s -> s.Split '\t')   // split each line into array
            |> Seq.map (fun a -> {Visits=int a.[0]; Count=int a.[1]}) // and create record

    myRecord.fromFile @"D:\data.csv"
    |> Seq.fold (fun (tv, tc) r -> (tv+r.Visits*r.Count, tc+r.Count)) (0,0)
    |> (fun t -> float (fst t) / float (snd t))
    //val mean : float = 2.866666667
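
To try the parsing and fold steps without a file on disk, here is a minimal sketch that feeds the sample rows from the question through the same pipeline. The `lines` array and the `VisitRow` record name are assumptions made for illustration; they stand in for the contents of a tab-separated file.

```fsharp
// Hypothetical in-memory stand-in for the tab-separated file (header + sample rows).
let lines =
    [| "Visits\tCount"
       "1\t8"
       "2\t9"
       "3\t5"
       "4\t3"
       "5\t2"
       "6\t1"
       "7\t1"
       "10\t1" |]

// Illustrative record type with the same two columns as the sample input.
type VisitRow = { Visits: int; Count: int }

let rows =
    lines
    |> Seq.skip 1                                   // skip the header line
    |> Seq.map (fun (s: string) -> s.Split '\t')    // split each line into fields
    |> Seq.map (fun a -> { Visits = int a.[0]; Count = int a.[1] })

// Accumulate (sum of visits * count, sum of count) in a single pass.
let totalVisits, totalCount =
    rows
    |> Seq.fold (fun (tv, tc) r -> (tv + r.Visits * r.Count, tc + r.Count)) (0, 0)

// Convert to float before dividing to avoid integer division.
let mean = float totalVisits / float totalCount
printfn "%.2f" mean
```

For the sample data the two totals are 86 and 30, so the mean is 86 / 30, which rounds to the 2.87 given in the question.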

It is worth adding that with type providers in F# 3.0, accessing CSV files becomes much easier. A type provider can statically infer the schema of the CSV file at compile time and generate a type representing the columns (e.g. BookWindow ), and it also infers the data types of the individual columns.

For example, take a look at the article “Using the Yahoo Finance Type Provider” in the “Financial Modeling” section of the new Try F# website . You can write something like:

    #r "Samples.Csv.dll"

    // Type provider that generates schema based on CSV file located online
    [<Literal>]
    let url = "http://ichart.finance.yahoo.com/table.csv?s=MSFT"
    let msft = new Samples.FSharp.CsvProvider.MiniCsv<url>()

    // The provider automatically infers the structure and we
    // can access columns as properties of the 'row' object
    for row in msft.Data do
        printfn "%A %f" row.Date row.Close

As far as I know, the latest publicly available version of the CSV provider is in the F# 3.0 Sample Pack . I also have a possibly more efficient version that handles type inference as well in my GitHub repository .

Once you have the data in memory, you can perform calculations using standard F# functions. For example, to calculate the average stock closing price (you can try this on Try F#), you can write:

  Seq.average [ for row in msft.Data -> row.Close ] 

This generates a list of closing prices and then calls the standard average function on the numbers.
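
The same idea carries over to the question's data, although a plain `Seq.average` over the Visits column would ignore the counts. Below is a minimal sketch of a weighted mean using `Seq.sumBy`, assuming rows shaped like the BookWindow records shown in the question. The `rows` value is hand-written sample data, not provider output, and `weightedMean` is an illustrative name.

```fsharp
type BookWindow = { Visits: int; Count: int }

// Sample rows copied from the question; a real run would read these from the CSV.
let rows =
    [ (1, 8); (2, 9); (3, 5); (4, 3); (5, 2); (6, 1); (7, 1); (10, 1) ]
    |> List.map (fun (v, c) -> { Visits = v; Count = c })

// Weighted mean: sum of (visits * count) divided by sum of count.
let weightedMean =
    let total = rows |> Seq.sumBy (fun r -> float (r.Visits * r.Count))
    let count = rows |> Seq.sumBy (fun r -> float r.Count)
    total / count

printfn "%.2f" weightedMean
```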


This probably overcomplicates things and is not the cleanest solution, but you can still work with what you have. Map the BookWindow records to individual arrays if that gives you a convenient way to do your calculations.

    type BookWindow =
        { Visits: int
          Count: int }

    // Sample data
    let list = [|{Visits = 1; Count = 8;}; {Visits = 2; Count = 9;}; {Visits = 3; Count = 5;}|]

    // Split the records into one array per column
    let visitcol = list |> Array.map (fun r -> r.Visits)
    let countcol = list |> Array.map (fun r -> r.Count)

    // Multiply the two columns element by element
    Array.map2 (fun v c -> v * c) visitcol countcol
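
To get from the element-wise products to the mean the question asks for, one possible continuation is to sum the products and divide by the total count. This is a sketch using the full sample table from the question; the names `visits`, `counts`, `weightedTotal`, `mean`, and `rounded` are illustrative.

```fsharp
// The two columns of the sample input from the question.
let visits = [| 1; 2; 3; 4; 5; 6; 7; 10 |]
let counts = [| 8; 9; 5; 3; 2; 1; 1; 1 |]

// Multiply the columns pairwise, then sum the products.
let weightedTotal =
    Array.map2 (fun v c -> v * c) visits counts
    |> Array.sum

// Divide by the total number of users; convert to float to avoid integer division.
let mean = float weightedTotal / float (Array.sum counts)

// Round to two decimal places, as in the expected output.
let rounded = System.Math.Round(mean, 2)
printfn "%.2f" rounded
```

Note that `Array.map2` throws if the two arrays differ in length, so both columns need to come from the same set of rows.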

Source: https://habr.com/ru/post/1441495/

