Computing slopes in NumPy (or SciPy)

I am trying to find the fastest and most efficient way to calculate slopes using NumPy and SciPy. I have a data set of three Y variables and one X variable, and I need to calculate their individual slopes. For example, I can easily do this one row at a time, as shown below, but I was hoping there is a more efficient way. I also don't think linregress is the best way to go, because I don't need any of the auxiliary values like the intercept, standard error, etc. in my results. Any help is appreciated.

    import numpy as np
    from scipy import stats

    Y = np.array([
        [ 2.62710000e+11, 3.14454000e+11, 3.63609000e+11, 4.03196000e+11, 4.21725000e+11,
          2.86698000e+11, 3.32909000e+11, 4.01480000e+11, 4.21215000e+11, 4.81202000e+11],
        [ 3.11612352e+03, 3.65968334e+03, 4.15442691e+03, 4.52470938e+03, 4.65011423e+03,
          3.10707392e+03, 3.54692896e+03, 4.20656404e+03, 4.34233412e+03, 4.88462501e+03],
        [ 2.21536396e+01, 2.59098311e+01, 2.97401268e+01, 3.04784552e+01, 3.13667639e+01,
          2.76377113e+01, 3.27846013e+01, 3.73223417e+01, 3.51249997e+01, 4.42563658e+01]])
    X = np.array([1990., 1991., 1992., 1993., 1994., 1995., 1996., 1997., 1998., 1999.])

    slope_0, intercept, r_value, p_value, std_err = stats.linregress(X, Y[0, :])
    slope_1, intercept, r_value, p_value, std_err = stats.linregress(X, Y[1, :])
    slope_2, intercept, r_value, p_value, std_err = stats.linregress(X, Y[2, :])
    slope_0 = slope_0 / Y[0, 0]
    slope_1 = slope_1 / Y[1, 0]
    slope_2 = slope_2 / Y[2, 0]

    b, a = np.polyfit(X, Y[1, :], 1)
    slope_1_a = b / Y[1, 0]
8 answers

Linear regression in one dimension is a vector calculation. This means we can combine the multiplications over the entire Y matrix and then vectorize the fits using the axis parameter in numpy. In your case that works out to the following:

    slopes = ((X*Y).mean(axis=1) - X.mean()*Y.mean(axis=1)) / ((X**2).mean() - (X.mean())**2)

You said you don't need the goodness-of-fit parameters, but most of them can be obtained in a similar way.
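For example, checking the formula on synthetic rows with known slopes (a quick sketch; the test arrays are mine):

    import numpy as np

    X = np.arange(1990., 2000.)      # 10 x-values, as in the question
    Y = np.vstack([ 2.0 * X + 1.0,   # three rows with known slopes 2, -3, 0.5
                   -3.0 * X + 7.0,
                    0.5 * X ])

    slopes = ((X*Y).mean(axis=1) - X.mean()*Y.mean(axis=1)) / ((X**2).mean() - (X.mean())**2)
    print(slopes)  # [ 2.  -3.   0.5]

Each row of Y gets its own slope in a single pass, with no Python-level loop.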


The fastest and most efficient way is to use the scipy function linregress, which calculates everything:

slope: slope of the regression line

intercept: intercept of the regression line

r-value: correlation coefficient

p-value: two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero

stderr: standard error of the estimate

And here is an example:

    from scipy.stats import linregress

    a = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
    b = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]
    linregress(a, b)

will return:

    LinregressResult(slope=0.20833333333333337, intercept=13.375, rvalue=0.14499815458068521, pvalue=0.68940144811669501, stderr=0.50261704627083648)

P.S. The mathematical formula for the slope:

    slope = (n*sum(x*y) - sum(x)*sum(y)) / (n*sum(x**2) - sum(x)**2)
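A quick numeric check of this formula against linregress, using the same a and b as above (my own verification snippet):

    import numpy as np
    from scipy.stats import linregress

    a = np.array([15, 12, 8, 8, 7, 7, 7, 6, 5, 3])
    b = np.array([10, 25, 17, 11, 13, 17, 20, 13, 9, 15])

    n = len(a)
    slope = (n * (a*b).sum() - a.sum() * b.sum()) / (n * (a**2).sum() - a.sum()**2)
    print(slope)                   # 0.20833333333333337
    print(linregress(a, b).slope)  # matches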


A formulation that is simpler than the accepted answer:

    import numpy as np

    x = np.linspace(0, 10, 11)
    y = np.linspace(0, 20, 11)
    y = np.c_[y, y, y]

    X = x - x.mean()
    Y = y - y.mean()
    slope = X.dot(Y) / X.dot(X)

The equation for the slope comes from the vector notation for the slope of a line obtained from simple regression.
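With the example above, slope evaluates to array([2., 2., 2.]): one slope per column of y.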


I did this using the np.diff() function:

    dx = np.diff(xvals)
    dy = np.diff(yvals)
    slopes = dy / dx

Note that this gives the point-to-point slopes between consecutive samples, not a single fitted slope.
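A minimal, self-contained illustration (the sample values are mine):

    import numpy as np

    xvals = np.array([0., 1., 2., 3.])
    yvals = np.array([0., 2., 4., 8.])

    print(np.diff(yvals) / np.diff(xvals))  # [2. 2. 4.]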


As said earlier, you can use scipy's linregress. Here is how to get only the slope out of it:

    from scipy.stats import linregress

    x = [1, 2, 3, 4, 5]
    y = [2, 3, 8, 9, 22]

    slope, intercept, r_value, p_value, std_err = linregress(x, y)
    print(slope)

Keep in mind that going this way you are also computing extra values such as r_value and p_value, so it will take longer than calculating only the slope manually. However, linregress is pretty fast.
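One way to see the difference is to time both approaches (a rough sketch; the numbers depend on your machine and data size):

    import timeit
    import numpy as np
    from scipy.stats import linregress

    x = np.arange(100.)
    y = 3.0 * x + np.random.default_rng(0).normal(size=100)

    def manual_slope():
        # centered-x formulation of the least-squares slope
        xc = x - x.mean()
        return xc.dot(y - y.mean()) / xc.dot(xc)

    print(timeit.timeit(lambda: linregress(x, y), number=10000))
    print(timeit.timeit(manual_slope, number=10000))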

Source: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html


If X and Y are defined the same way as in your question, you can use:

    import numpy

    dY = (numpy.roll(Y, -1, axis=1) - Y)[:, :-1]
    dX = (numpy.roll(X, -1, axis=0) - X)[:-1]
    slopes = dY / dX

numpy.roll() aligns the next observation with the current one; you just need to remove the last column, which holds the meaningless difference between the last and first observations. Then you can calculate all the slopes at once, without scipy.

In your example, dX is always 1, so you can save even more time by just computing slopes = dY.
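To see what the roll is doing, here is a small demo (the sample arrays are mine):

    import numpy

    Y = numpy.array([[1., 2., 4., 8.]])
    X = numpy.array([0., 1., 2., 3.])

    print(numpy.roll(Y, -1, axis=1))                # [[2. 4. 8. 1.]] : next observation shifted in
    print((numpy.roll(Y, -1, axis=1) - Y)[:, :-1])  # [[1. 2. 4.]] : wrapped-around column dropped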


I built on the other answers and on the original regression formula to write a function that works for any tensor. It calculates the slopes of the data along a given axis. So, if you have arbitrary tensors X[i,j,k,l] and Y[i,j,k,l] and you want to know the slopes along the third axis for all other axes, you can call it with calcSlopes( X, Y, axis = 2 ).

    import numpy as np

    def calcSlopes( x = None, y = None, axis = -1 ):
        assert x is not None or y is not None

        # assume that a single data argument holds equally
        # spaced y-values (like in the numpy plot command)
        if y is None:
            y = x
            x = None

        # move the axis we want to calculate the slopes along to the front,
        # as is necessary for subtracting the means;
        # that axis 'vanishes' anyway, so we don't need to swap it back
        y = np.swapaxes( y, axis, 0 )
        if x is not None:
            x = np.swapaxes( x, axis, 0 )

        # https://en.wikipedia.org/wiki/Simple_linear_regression
        # beta = sum_i ( X_i - <X> ) ( Y_i - <Y> ) / ( sum_i ( X_i - <X> )^2 )
        if x is None:
            # the axis with the values to reduce must be trailing for
            # broadcast_to, therefore transpose
            x = np.broadcast_to( np.arange( y.shape[0] ), y.T.shape ).T
            x = x - ( x.shape[0] - 1 ) / 2.  # mean of (0,1,...,n-1) is (n-1)/2
        else:
            x = x - np.mean( x, axis = 0 )
        y = y - np.mean( y, axis = 0 )

        # beta = sum_i x_i y_i / sum_i x_i^2 (for centered x and y)
        slopes = np.sum( np.multiply( x, y ), axis = 0 ) / np.sum( x**2, axis = 0 )
        return slopes

It also has a trick for working with equally spaced data, when only y is given. For example:

    y = np.array( [ [ 1, 2, 3, 4 ],
                    [ 2, 4, 6, 8 ] ] )
    print( calcSlopes( y, axis = 0 ) )
    print( calcSlopes( y, axis = 1 ) )

    x = np.array( [ [ 0, 2, 4, 6 ],
                    [ 0, 4, 8, 12 ] ] )
    print( calcSlopes( x, y, axis = 1 ) )

Output:

    [1. 2. 3. 4.]
    [1. 2.]
    [0.5 0.5]
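As a sanity check (my own snippet, assuming calcSlopes from above is defined), the function agrees with np.polyfit on 2-D data:

    rng = np.random.default_rng(0)
    x = np.arange(10.)
    y = rng.normal(size=(3, 10))

    print( calcSlopes( np.broadcast_to( x, y.shape ), y, axis = 1 ) )
    print( np.polyfit( x, y.T, 1 )[0] )  # same three slopes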

This readable one-liner should be efficient enough, without bothering with scipy:

    slope = np.polyfit(X, Y, 1)[0]

Applied to your data, you end up with:

    import numpy as np

    Y = np.array([
        [ 2.62710000e+11, 3.14454000e+11, 3.63609000e+11, 4.03196000e+11, 4.21725000e+11,
          2.86698000e+11, 3.32909000e+11, 4.01480000e+11, 4.21215000e+11, 4.81202000e+11],
        [ 3.11612352e+03, 3.65968334e+03, 4.15442691e+03, 4.52470938e+03, 4.65011423e+03,
          3.10707392e+03, 3.54692896e+03, 4.20656404e+03, 4.34233412e+03, 4.88462501e+03],
        [ 2.21536396e+01, 2.59098311e+01, 2.97401268e+01, 3.04784552e+01, 3.13667639e+01,
          2.76377113e+01, 3.27846013e+01, 3.73223417e+01, 3.51249997e+01, 4.42563658e+01]]).T
    X = [1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999]

    print(np.polyfit(X, Y, 1)[0])

The output is [1.54983152e+10 9.98749876e+01 1.84564349e+00]
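If you also want the normalized slopes from your question (each slope divided by the first value of its series), that is one more vectorized line (a sketch based on the transposed Y above):

    slopes = np.polyfit(X, Y, 1)[0]
    print(slopes / Y[0, :])  # divide each slope by that series' first observation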

