The most efficient way to find the minimum float in a python list

A quick question that is more efficient for finding the smallest number (float) in a long list (10000+ items)

this is

min(mylist) 

or

 mylist.sort() 

and then returning

 mylist[0] 

or something else...

thanks!

+6
source share
3 answers

If the list is already populated, min() is the most efficient way.

In special scenarios, you can use several tricks:

  • If you create a list from scratch, just store the smallest element in an external variable so that the answer is specified in O(1) .
  • If only Float is listed, use Array , which gives the best performance.
  • You can keep the list sorted using bisect .
  • Use Python Heap , which even has an efficient min() implementation. Make sure you understand the effects, mostly slower. ( credit: interjay )
+10
source

Frst, if you care about performance in Python (which is not always reasonable, but you need to talk about this differently), you should use the timeit module . Even in C, it's hard to predict how some functions will behave after compilation, and it's harder in Python. People often confidently express which features are faster, depending on the data. Then - using timeit, I mean - you could recognize yourself.

Secondly, if you really care about performance in float lists, you should not use lists at all, but numpy arrays. Using IPython here, under Python 2.7.2, which makes synchronization easier:

 In [41]: import random, numpy In [42]: a = [0.1*i for i in range(10**5)] In [43]: timeit min(a) 100 loops, best of 3: 4.55 ms per loop In [44]: timeit sorted(a)[0] 100 loops, best of 3: 4.57 ms per loop In [45]: random.shuffle(a) In [46]: timeit min(a) 100 loops, best of 3: 6.06 ms per loop In [47]: timeit min(a) # to make sure it wasn't a fluke 100 loops, best of 3: 6.07 ms per loop In [48]: timeit sorted(a)[0] 10 loops, best of 3: 65.9 ms per loop In [49]: b = numpy.array(a) In [50]: timeit b.min() 10000 loops, best of 3: 97.5 us per loop 

And we will note a few things. (1) Python sorting (timsort) works very well on data that sorted runs, so sorting an already sorted list has virtually no penalty. (2) Sorting a random list, on the other hand, is much slower, and this will only get worse as the data grows. (3) Numpy.min () in a float array is sixty times faster than min in a Python list, because it does not have to be shared.

+11
source

Well, you definitely need to iterate over the entire list, so the runtime should be O (n).

As sorting, O (n log n) time is already running, this is obviously not the best way to do this. I do not know the implementation of min (), but this should be the right way to do this.

+2
source

Source: https://habr.com/ru/post/909181/


All Articles