Python heapq versus complex complexity and performance

I am relatively new to python (using v3.x syntax) and would be grateful for the notes on the complexity and performance of heapq vs. sorted.

I have already implemented a heapq-based solution for the greedy "find the best work schedule" algorithm. But then I found out about the possibility of using "sorted" along with operator.itemgetter () and reverse = True.

Unfortunately, I could not find an explanation of the expected complexity and / or performance of the "sorted" compared to heapq.

+6
source share
2 answers

If you use a binary heap to put all the elements in order, then you do basically heapsort . It is slower than sorting algorightm into a sorted function , except that the implementation is pure python.

heapq faster than sorted if you need to add elements on the fly, i.e. additions and insertions may appear in an unspecified order. Adding a new element that preserves the internal order in any heap is faster than accessing the array after each insertion.

sorted faster if you need to restore all items in order later.

The only problem they can compete with is if you need some of the smallest (or largest) items from the collection. Although there are special algorithms for this case , whether heapq or sorted will work faster depends on the size of the source array and the part that you need to extract.

+6
source

heapq is implemented as a binary heap, The key points to consider in binary heaps, and extensions, heapq :

  • Search not supported
  • Insertions are a constant time on average
  • Exceptions are O (log n) time average

Further information on the binary heap described here: http://en.wikipedia.org/wiki/Binary_heap

Although heapq is a data structure that has binary heap properties, using sorted is a different concept. sorted returns a sorted list, so essentially the result, whereas heapq is a data structure that you constantly work with, which you can optionally sort with sorted .

Addtonal sorted info here: https://docs.python.org/3.4/library/functions.html#sorted

What exactly are you trying to accomplish?

Reply to OP comment:

Why do you think you need heapq specifically? A binary heap is a specialized data structure, and depending on your requirements, it is most likely not needed.

You seem very concerned about the work, but it is not clear why. If something is a "bad performer", but its total time does not matter, then in fact it does not matter in the larger picture. In general, a dict or list will execute, as a rule, perfectly. Why do you specifically think heapq needed?

I wonder if this is not a "not-ideal-be-adversary" situation.

Writing Python using C-extensions is a random use case reserved for cases where performance is a really important issue. (that is, it might be better to use, say, an XML parser, which is an extension of C, than something that is pure Python if you are dealing with large files and if performance is your main concern).

Relatively In the complex, continue to play with the structural case: were you unable to sort the sort and add elements through .append () :

I still do not understand what is used here. As I mentioned above, sorted and heapq really are two different concepts.

What is the use case for which you are so concerned about performance? (The absence of other factors that are not yet specified, I think that you can overemphasize the importance of the best performance of your code here.)

0
source

Source: https://habr.com/ru/post/972005/


All Articles