Retrieving keys from a hash table sorted by value efficiently

I use Emacs Lisp, but the cl package is loaded for some common lisp functions.

I have a hash table containing up to 50K entries, with integer keys mapped to triplets, something like this (but in the actual lisp):

 { 8 => '(9 300 12) 27 => '(5 125 9) 100 => '(10 242 14) } 

The second value in the triplet is the estimate, which was calculated during the complex algorithm that built the hash table. I need to compile a regular lisp list with all the keys from the hash sorted by count (i.e., all keys sorted by frame values).

So, for the above, I need this list:

 '(27 100 8) 

I am currently doing this with two steps that feel less effective than it should be.

Is there a good way to do this?

My current solution uses maphash to collect keys and values ​​into two new lists, and then does sort usual way, referencing the list of ratings in the predicate. Looks like I could combine collecting and sorting together.

EDIT | I am also not tied to using a hash table, although I need a constant access time for whole keys that do not have linear diversity.

EDIT 2 | It looks like the binary tree sort implementation may work, where the labels in the tree are estimates and the values ​​are keys ... so I do the sorting when I iterate over the hash.

... Continues reading wikipedia page on sorting algorithms

+4
source share
1 answer

In principle, you correctly decided: you need to collect the keys in a list:

 (defun hash-table-keys (hash-table) (let ((keys ())) (maphash (lambda (kv) (push k keys)) hash-table) keys)) 

and then sort the list:

 (sort (hash-table-keys hash-table) (lambda (k1 k2) (< (second (gethash k1 hash-table)) (second (gethash k2 hash-table))))) 

Combining a collection of keys with sorting is possible: you need to collect the keys in a tree and then "smooth" the tree. However, this will only matter if you are dealing with really huge tables. Also, since Emacs Lisp compiles to bytecodes, you may find that using the built-in sort even faster than using a tree. Consider also the cost of development - you will need to write code, the cost of which will be mostly educational.

To do deeper, collecting keys, selects a list of keys (which you will definitely need in any case for the result), and sort works "in place", so the "easy way" is about as good as it is.

The "tree" method will allocate a tree (the same amount of memory as the required list of keys), and filling and smoothing is the same O(n*log(n)) process as the "collect + sort" method. However, keeping the tree balanced and then smoothing it in place (that is, without highlighting a new list) is not a simple exercise.

Bottom line: KISS .

+6
source

Source: https://habr.com/ru/post/1485809/


All Articles