Take some data, return pure piecewise function

Given the list of {x, y} datapoints, we return a pure function f (from real to real numbers), so f [x] == y for each {x, y} in the data. If x is not one of the x values, return the y value for the previous point (one with an x ​​value less than x). If the function receives a value less than the value of the first x-value in the data - i.e. There is no previous point - then return 0.

For example, the data {{1,20}, {2,10}} returns a pure function that looks like this:

Graph of the function defined by {{1,20}, {2,10}} http://yootles.com/outbox/so/piecewise.png

I wrote something using Function and Piecewise , which I will include as an answer, but it looks like it might be inefficient, especially for a large list of points. [UPDATE: My answer may actually be decent. I will probably go with him if anyone has no better ideas.]

To be clear, we are looking for a function that takes one argument β€” a list of pairs of numbers β€” and returns a pure function. This pure function should take a number and return a number.

+6
source share
5 answers

Manual Encoding Binary Search

If someone wants to sacrifice laconicism for performance, then the correct binary search approach is needed:

 stepifyWithBinarySearch[data_] := With[{sortedData = SortBy[data, First], len = Length @ data} , Module[{min = 1, max = len, i, x, list = sortedData} , While[min <= max , i = Floor[(min + max) / 2] ; x = list[[i, 1]] ; Which[ x == #, min = max = i; Break[] , x < #, min = i + 1 , True, max = i - 1 ] ] ; If[0 == max, 0, list[[max, 2]]] ]& ] 

Equipped with some sample forests ...

 test[s_, count_] := Module[{data, f} , data = Table[{n, n^2}, {n, count}] ; f = s[data] ; Timing[Plot[f[x], {x, -5, count + 5}]] ] 

... we can test and use various solutions:

 test[stepifyWithBinarySearch, 10] 

step plot

The following timings were received on my machine:

<Preview> test [stepify (* version 1 *), 100000] 57.034 s test [stepify (* version 2 *), 100000] 40.903 s test [stepifyWithBinarySearch, 100000] 2.902 s

I expect further performance gains to be made by compiling various functions, but I will leave this as an exercise for the reader.

Better: preliminary interpolation (response to comment from dreeves)

It is not clear that a manual, uncompiled binary search will beat the built-in function of Mathematica. Perhaps this is not so surprising for Piecewise , since by prohibiting optimization, it is actually just an illustrious expression for testing an IF-THEN-ELSEIF chain of arbitrary complexity. However, one would expect that Interpolation would be much better, since it is essentially intended for this task.

The good news is that Interpolation provides a very fast solution, provided that one organizes the calculation of interpolation only once:

 stepifyWithInterpolation[data_] := With[{f=Interpolation[ {-1,1}*#& /@ Join[{{-9^99,0}}, data, {{9^99, data[[-1,2]]}}] , InterpolationOrder->0 ]} , f[-#]& ] 

This is dazzlingly fast, and on my machine it only takes 0.016 seconds to complete test[stepifyWithInterpolation, 100000] .

+5
source

You can also do this with Interpolation (with InterpolationOrder->0 ), but this is interpolation using the value of the next point instead of the previous one. But then I realized that you can change this with a simple double negation trick:

 stepify[data_] := Function[x, Interpolation[{-1,1}*#& /@ Join[{{-9^99,0}}, data, {{9^99, data[[-1,2]]}}], InterpolationOrder->0][-x]] 
+5
source

My previous attempts did not work properly (they were only fine for two steps).

I think the following, in the same directions, works:

 g[l_] := Function[x, Total[#[[2]] UnitStep[x - #[[1]]] & /@ Transpose@ ({ First@ #, Differences[Join[{0}, Last@ #]]} &@ Transpose@l )]] Plot[g[{{1, 20}, {2, 10}, {3, 20}}][x], {x, 0, 6}] 

enter image description here

+5
source

Compiling a WReach answer does lead to significant speedup. Using all the functions defined in the WReach answer, but overriding test to

 test[s_,count_]:=Module[{data,f}, data=Table[{n,n^2}, {n,count}]; f=s[ToPackedArray[ N@data ]]; Timing[Plot[f[x],{x,-5,count+5}]]] 

(this is necessary to make the resulting arrays be packed, thanks to Sjoerd de Vries for specifying this) and definition

 ClearAll[stepifyWRCompiled]; stepifyWRCompiled[data_]:=With[{ len=Length@data ,sortedData=SortBy[data,First]}, Compile[{{arg,_Real}},Module[{min=1,max=len,i,x,list=sortedData}, While[ min<=max, i=Floor[(min+max)/2]; x=list[[i,1]]; Which[ x\[Equal]arg,min=max=i;Break[], x<arg,min=i+1,True,max=i-1 ] ]; If[0==max,0,list[[max,2]]] ],CompilationTarget->"WVM",RuntimeOptions\[Rule]"Speed"]] 

(the With block is needed to explicitly insert sortedData into the code block that needs to be compiled), we get the results faster than the solution using Interpolation , albeit slightly:

 Monitor[ tbl = Table[ {test[stepifyWRCompiled, l][[1]], test[stepifyWithInterpolation, l][[1]], test[stepifyWithBinarySearch, l][[1]]}, {l, 15000, 110000, 5000}], l] tbl//TableForm (* 0.002785 0.003154 0.029324 0.002575 0.003219 0.031453 0.0028 0.003175 0.034886 0.002694 0.003066 0.034896 0.002648 0.003002 0.037036 0.00272 0.003019 0.038524 0.00255 0.00325 0.041071 0.002675 0.003146 0.041931 0.002702 0.003044 0.045077 0.002571 0.003052 0.046614 0.002611 0.003129 0.047474 0.002604 0.00313 0.047816 0.002668 0.003207 0.051982 0.002674 0.00309 0.054308 0.002643 0.003137 0.05605 0.002725 0.00323 0.06603 0.002656 0.003258 0.059417 0.00264 0.003029 0.05813 0.00274 0.003142 0.0635 0.002661 0.003023 0.065713 *) 

(the first column is compiled binary search, the second interpolation, the third, uncompiled binary search).

Note that I'm using CompilationTarget->"WVM" , not CompilationTarget->"C" ; this is due to the fact that the function is compiled with a large number of data points "inline", and if I use compilation in C with 100,000 data points, I see that gcc lasts for a long time and takes a lot of memory (I think the resulting The C file is huge, but I did not check). So I just use compilation for "WVM".

I think the general conclusion here is that Interpolation also just does a constant time search (binary search or something similar, presumably), and manual code is just a little faster because it is less general.

+3
source

The following works:

 stp0[x_][{{x1_,y1_}, {x2_,y2_}}] := {y1, x1 <= x < x2} stepify[{}] := (0&) stepify[data_] := With[{x0 = data[[1,1]], yz = data[[-1,2]]}, Function[x, Piecewise[Join[{{0, x<x0}}, stp0[x] /@ Partition[data, 2,1]], yz]]] 

Note that without With , things like {{1,10},{2,20}}[[1,1]] will remain in the returned function, which seems a bit wasteful.

By the way, I decided to call it stepify, since it turns a list of points into a step function.

+2
source

Source: https://habr.com/ru/post/893740/


All Articles