Optional Multiple Return Values in Scientific Python

Question


I use SciPy/NumPy for research code instead of Matlab. There is one drawback I run into often. I have found a workaround, but I would like to know the best practice and whether there is a better solution. Imagine some mathematical optimization:

    import numpy as np

    def calculation(data, max_it=10000, tol=1e-5):
        k = 0
        rmse = np.inf
        while k < max_it and rmse > tol:
            # calc and modify data - rmse becomes smaller in each iteration
            k += 1
        return data

It works fine, and I call it in several places in my code, for example:

    import module
    d = module.calculation(data)

But sometimes I want to inspect additional information and need more than one return value. If I simply add return values, I have to change all the other code that calls the function so that it unpacks the first return value. This is one of the few situations where I prefer Matlab to SciPy: Matlab only evaluates the first return value unless you explicitly ask for the rest.
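To make the problem concrete, here is a minimal sketch of what happens when return values are added to an existing function. The names `calculation_v1` and `calculation_v2`, the halving step, and the placeholder diagnostics are hypothetical and only serve the illustration.

```python
def calculation_v1(data):
    """Original signature: returns only the result."""
    return [x * 0.5 for x in data]


def calculation_v2(data):
    """Same computation, but now also returns diagnostics."""
    result = [x * 0.5 for x in data]
    k, rmse = 1, 0.0  # placeholder diagnostics, not real values
    return result, k, rmse


data = [2.0, 4.0]

d_old = calculation_v1(data)          # a list, as existing callers expect
d_new = calculation_v2(data)          # now a 3-tuple: existing callers break
d_fixed, _, _ = calculation_v2(data)  # every call site has to unpack
```

In Matlab the equivalent of `d = calculation(data)` would keep working after the change, because only the first output is bound unless more are requested.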

So my workaround for Matlab-like (= optional) multiple return values is to use global variables [of the module]:

    def calculation(data, max_it=10000, tol=1e-5):
        global k
        global rmse
        k = 0
        rmse = np.inf
        while k < max_it and rmse > tol:
            # calc and modify data - rmse becomes smaller in each iteration
            k += 1
        return data

My function calls work unchanged, and if I want to inspect something in IPython, I declare some variables global, call reload(module), and look at the value using module.rmse.

But I could also use an OO approach from the start, or use pdb or some other IPython magic.


python numpy scipy ipython

user421929 Jul 15 '13 at 10:10


1 answer

unutbu · Accepted Answer · 2013-07-15T10:17:09+0000

You can signal that you want more information by passing an info=True argument when calling calculation. This is the approach used by np.unique (with return_inverse and return_index) and scipy.optimize.leastsq (with its full_output parameter):

    def calculation(data, max_it=10000, tol=1e-5, info=False):
        k = 0
        rmse = np.inf
        while k < max_it and rmse > tol:
            # calc and modify data - rmse becomes smaller in each iteration
            k += 1
        if info:
            return data, k, rmse
        else:
            return data
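As a runnable illustration of the flag-based approach, the sketch below fills in the loop body with a hypothetical update rule (each value moves halfway toward the mean) and uses plain Python floats instead of NumPy arrays to stay dependency-free. Only the `info` handling reflects the pattern described above; the computation itself is an assumption made for the example.

```python
import math


def calculation(data, max_it=10000, tol=1e-5, info=False):
    """Pull every value halfway toward the mean until the RMS deviation <= tol.

    The update rule is a hypothetical stand-in for the real computation.
    """
    data = list(data)  # work on a copy so the caller's list is untouched
    k = 0
    rmse = math.inf
    while k < max_it and rmse > tol:
        mean = sum(data) / len(data)
        data = [mean + 0.5 * (x - mean) for x in data]
        # RMS deviation of the updated values from the (unchanged) mean
        rmse = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
        k += 1
    if info:
        return data, k, rmse
    return data


d = calculation([0.0, 2.0])                        # existing call sites stay as they are
d2, k, rmse = calculation([0.0, 2.0], info=True)   # opt in to the diagnostics
```

Callers that never pass `info` keep receiving a single value, so adding the flag does not require touching existing code.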

Or you can assign additional attributes to the calculation function:

    def calculation(data, max_it=10000, tol=1e-5):
        k = 0
        rmse = np.inf
        while k < max_it and rmse > tol:
            # calc and modify data - rmse becomes smaller in each iteration
            k += 1
        calculation.k = k
        calculation.rmse = rmse
        return data

The added information will then be available using

    import module
    d = module.calculation(data)
    rmse = module.calculation.rmse

Note that this last approach will not work reliably if calculation is run from multiple threads at the same time.

In CPython (due to the GIL), only one thread can execute at any given time, so there is little appeal in running calculation in several threads. But who knows? A situation may arise that calls for threads on a small scale, for example in a GUI. In that case, reading calculation.k or calculation.rmse may return incorrect values, because another thread may have overwritten them in the meantime.
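The underlying hazard is shared state, and it can be shown deterministically without threads: any later call overwrites the attributes left by an earlier one. The sketch below reuses a hypothetical update rule (values move halfway toward the mean) and limits the iteration count through `max_it` so that the two calls finish with different values of `k`.

```python
import math


def calculation(data, max_it=10000, tol=1e-5):
    """Hypothetical stand-in computation that stores diagnostics on itself."""
    data = list(data)
    k = 0
    rmse = math.inf
    while k < max_it and rmse > tol:
        mean = sum(data) / len(data)
        data = [mean + 0.5 * (x - mean) for x in data]
        rmse = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
        k += 1
    # Shared state: one slot per attribute, no matter how many callers exist
    calculation.k = k
    calculation.rmse = rmse
    return data


first = calculation([0.0, 2.0], max_it=5)
k_first = calculation.k      # diagnostics of the first call

second = calculation([0.0, 2.0], max_it=3)
k_now = calculation.k        # overwritten: now describes the second call
```

With threads, the second call can land between the first call's return and the read of `calculation.k`, which is why returning the values explicitly is the safer choice.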

Furthermore, the Zen of Python says: "Explicit is better than implicit."

Therefore, I would recommend the first approach over the second.
