Why use an array to implement a “list” instead of a hash table?

Question


Compare an array with a hash table whose keys are just the integer indexes of the list.

Their average-case big-O costs for insertion, lookup, and deletion are all O(1), constant time. I understand that an array can get some low-level wins from cache locality, and that hash table operations carry a marginal (mostly constant) overhead, but hash tables give you sparseness for free, which is a big win in some applications.

What other significant (or small) differences am I missing?

Context: I sometimes bring this up when I interview programmers. Usually the context is "how would you implement the Javascript array type inside a JS virtual machine?" For densely packed data I side with a native array, but I want better-reasoned arguments than the intuition that it "seems less like overkill."

+4

language-agnostic arrays list hashtable data-structures

Ben zotto Feb 15 '10 at 16:29


4 answers

When you look at it from the perspective of someone who wants to implement the behavior of a Javascript pseudo-array, you are right that a hash table is the better way to do it, especially since Javascript arrays do not have a fixed length and must accept entries at any index. Arrays in Javascript only look like arrays; they behave more like hash tables.

But in a language that is a little closer to the machine, the performance and space advantages of a real array, for data that can be stored efficiently in one, are quite significant. The advantage of a hash table here is largely limited to sparse arrays, and sparse data is not what you should use an array for. That case really is best handled by a hash table with integer keys.
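As a rough illustration of the sparse case, here is a minimal Python sketch in which a list stands in for the real array and a dict for the hash table with integer keys. The helper name store_sparse is made up for this example, and the exact byte counts depend on the interpreter.

```python
import sys


def store_sparse(index, value):
    """Store a single value at a large index, once in a list and once in a dict."""
    # The array has to allocate every slot up to and including `index`.
    dense = [None] * (index + 1)
    dense[index] = value
    # The hash table stores only the one occupied key.
    sparse = {index: value}
    return dense, sparse


dense, sparse = store_sparse(1_000_000, "x")
print(len(dense), len(sparse))
print(sys.getsizeof(dense) > sys.getsizeof(sparse))
```

With densely packed indexes the comparison flips, because the dict then pays per-entry overhead for keys that the list gets implicitly from position.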

Insertion, lookup, and deletion by index are also O(1) for arrays, in all cases, but with a much smaller constant factor than for hash tables (and not only because of cache locality). Arrays also need less space per entry. If you want to delete or insert entries so that the following entries change their index accordingly, that is O(n), where n is the number of entries that have to be moved, but it is also O(n) for hash tables, again with a much higher fixed overhead. That is the operation for which you are better off with a linked list. In addition, growing an array is cheaper than growing a hash table, which may need to rehash all of its entries.
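The index-shifting insert described above could be sketched like this in Python, again with a list standing in for the array and a dict for the hash table. Both function names are hypothetical, and the dict version assumes the keys are exactly 0..n-1.

```python
def insert_shifting_list(items, index, value):
    """Insert into a list; list.insert moves the tail up by one slot, O(n)."""
    items.insert(index, value)
    return items


def insert_shifting_table(table, index, value):
    """Insert into a dict keyed 0..n-1 and renumber every later entry, O(n)."""
    # Walk from the last key down so no entry is overwritten before it moves.
    for key in range(len(table) - 1, index - 1, -1):
        table[key + 1] = table[key]  # each move hashes a key again
    table[index] = value
    return table


print(insert_shifting_list(["a", "b", "d"], 2, "c"))
print(insert_shifting_table({0: "a", 1: "b", 2: "d"}, 2, "c"))
```

Both versions touch every entry after the insertion point; the dict version additionally hashes a key for each entry it moves.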

All different types of collections have their own particular advantages and disadvantages. That is why there are so many.

+2

x4u Feb 16 '10 at 16:37


Because a list is usually ordered, while a hash table is not. When you add items to a list and expect their order to stay consistent, a hash table gives no guarantee about the order in which you will get them back, whereas an array preserves it.
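A small Python sketch of the ordering point. One caveat: Python's dict (3.7 and later) happens to remember insertion order, which many hash table implementations do not, but even so that is not index order. The helper name values_in_index_order is made up for this example.

```python
def values_in_index_order(table):
    """Recover index order from a dict with integer keys by sorting the keys."""
    return [table[key] for key in sorted(table)]


table = {}
table[2] = "c"  # entries arrive out of index order
table[0] = "a"
table[1] = "b"

as_array = ["a", "b", "c"]  # an array is always traversed in index order

print(list(table.values()))          # insertion order, not index order
print(values_in_index_order(table))  # needs an extra O(n log n) sort
```

The array gets the ordered traversal for free; the hash table needs either a sort or a second structure that tracks the order.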

+1

tvanfosson Feb 15 '10 at 16:32


Because hash functions are not free. Linear factors matter. Worst-case times matter. Count the instructions.

In the specific case you cite, the underlying implementation of Javascript, there may be so much other overhead that it drowns out these concerns. However, if someone is doing something mathematical that really hammers on an array with simple numeric keys, the array has to be better.
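One way to make "count the instructions" concrete, as a sketch only: the hypothetical CountingKey class below is an int subclass that counts how often it is hashed. Indexing a Python list never calls the hash function, while every dict lookup calls it once.

```python
class CountingKey(int):
    """Integer key that counts how many times it gets hashed (illustrative only)."""
    hash_calls = 0

    def __hash__(self):
        CountingKey.hash_calls += 1
        return int.__hash__(self)


def count_hashes(n):
    """Read n entries by index from a list and from a dict; count the hash calls."""
    dense = [i * i for i in range(n)]
    table = {i: i * i for i in range(n)}  # built with plain ints, not counted
    keys = [CountingKey(i) for i in range(n)]

    CountingKey.hash_calls = 0
    list_total = sum(dense[k] for k in keys)  # plain indexing, no hashing
    list_hashes = CountingKey.hash_calls

    CountingKey.hash_calls = 0
    dict_total = sum(table[k] for k in keys)  # one hash per lookup
    dict_hashes = CountingKey.hash_calls

    return list_hashes, dict_hashes, list_total == dict_total


print(count_hashes(1000))
```

Hash calls are only one of the costs; a real hash table also pays for the modulus, the key comparison, and any collision handling.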

0

bmargulies Feb 15 '10 at 16:45


Ritsaert hornstra · Accepted Answer · 2010-02-16T10:31:49+0000

An array is just a special case of a hash table, one where the hash function is trivial:

 f(x) := x;

and where the modulus used is the same as the data word size (and therefore the array size).

If you do not allow non-unique values, you do not need the "next" pointers, and voilà, we have an array.

Because there is no complex hash function and no modulo calculation, this is very fast, but it is only applicable when the array can be kept small. Very large arrays with many empty slots waste memory and can trigger nasty effects such as swapping or thrashing to disk.
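Read literally, that description could be sketched as the following Python class. The name DirectAddressTable and its methods are invented for this example, and it assumes integer keys in range(capacity), so that no two keys ever share a slot.

```python
class DirectAddressTable:
    """Hash table with hash f(x) = x and a modulus equal to the capacity.

    Assumes keys are integers in range(capacity), so no key collides with
    another and no "next" pointers (collision chains) are needed.
    """

    _EMPTY = object()  # marks a slot that holds no value yet

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [self._EMPTY] * capacity  # memory is paid for every slot

    def _slot(self, key):
        if not 0 <= key < self.capacity:
            raise IndexError(key)
        return key % self.capacity  # identity hash; the modulus is a no-op here

    def put(self, key, value):
        self.slots[self._slot(key)] = value

    def get(self, key):
        value = self.slots[self._slot(key)]
        if value is self._EMPTY:
            raise KeyError(key)
        return value


table = DirectAddressTable(8)
table.put(3, "x")
print(table.get(3))
```

Stripped of the bounds check and the empty-slot marker, put and get reduce to plain array indexing, which is the point of the analogy.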
