Is it good practice to use collections.OrderedDict?

I sometimes like to use collections.OrderedDict when I need an associative array in which the order of the keys must be preserved. The best example I have is parsing or creating CSV files, where it is useful to have the column order stored implicitly in the object.

But I'm worried that this is bad practice, since it seems to me that the whole concept of an associative array is that the order of the keys never matters, and that any operations that depend on ordering should just use lists, because that's why lists exist (this could be done for the CSV example above). I have no data on this, but I bet that performance for lists is universally better than for OrderedDict.

So my question is: are there any really convincing use cases for OrderedDict? Is the csv usage example a good example of where it should be used or bad?

+6
5 answers

For your specific use case (writing CSV files), an ordered dict is not required. Use csv.DictWriter, which takes the column order from its fieldnames argument.
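A minimal sketch of that suggestion, assuming made-up column names and rows (nothing here comes from the question itself). The column order is taken from `fieldnames`, so the key order inside each row dict does not matter; `io.StringIO` stands in for a real file.

```python
import csv
import io

# Hypothetical sample data; the column names are invented for illustration.
fieldnames = ["id", "name", "email"]
rows = [
    {"name": "Ada", "email": "ada@example.com", "id": 1},
    {"email": "bob@example.com", "id": 2, "name": "Bob"},
]

# In real code this would be open("out.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()    # header follows the order of fieldnames
writer.writerows(rows)  # each row is written in fieldnames order

csv_text = buffer.getvalue()
print(csv_text)
```

Keeping the column order in one explicit list also makes it easy to reorder or drop columns without touching the row data.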

Personally, I use OrderedDict when I need LIFO / FIFO access, for which there is even a popitem method. I honestly could not come up with a good use case of my own, but the one mentioned in PEP 372 about attribute order is a good one:

XML/HTML processing libraries currently drop the ordering of attributes, use a list instead of a dict which makes filtering cumbersome, or implement their own ordered dictionary. This affects ElementTree, html5lib, Genshi and many more libraries.

If you have ever wondered why a feature exists in Python, the PEP is a good place to start, because that is where the rationale for including the feature is described in detail.
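The LIFO / FIFO access mentioned at the start of this answer can be sketched as follows. The keys and values are placeholders; the only API used is `OrderedDict.popitem(last=...)`.

```python
from collections import OrderedDict

# Placeholder entries, inserted in a known order.
tasks = OrderedDict()
tasks["first"] = 1
tasks["second"] = 2
tasks["third"] = 3

# last=True (the default) pops the most recently inserted item: LIFO.
newest = tasks.popitem()

# last=False pops the oldest item: FIFO.
oldest = tasks.popitem(last=False)

remaining = list(tasks.items())
print(newest, oldest, remaining)
```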

+2

But I'm worried this is bad practice, since it seems to me that the whole concept of an associative array is that the order of the keys never matters,

Nonsense. That is not "the whole concept of an associative array." It's just that order rarely matters, so by default we give up ordering to get a conceptually simpler (and more efficient) data structure.

and that any operations that depend on ordering should just use lists, because that's why lists exist

Stop right there! Think for a second. How would you use lists? As a list of (key, value) pairs, with unique keys, right? Well, congratulations, my friend, you just reinvented OrderedDict, only with a terrible API and really slow. Any conceptual objection to an ordered mapping would apply to this ad hoc data structure as well. Fortunately, those objections are pointless. Ordered mappings are perfectly fine; they are just different from unordered mappings. Giving them a dedicated implementation with a good API and good performance improves people's code.
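To make the comparison concrete, here is a rough sketch of what the "list of pairs with unique keys" approach ends up looking like next to OrderedDict. The helper names `pairs_get` and `pairs_set` are hypothetical, written only for this illustration.

```python
from collections import OrderedDict

# Ad hoc "ordered mapping": a list of (key, value) pairs with unique keys.
pairs = [("id", 1), ("name", "Ada")]

def pairs_get(pairs, key):
    """Look up a key by scanning the list: O(n) per lookup."""
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)

def pairs_set(pairs, key, value):
    """Replace in place if the key exists, otherwise append: also O(n)."""
    for i, (k, _) in enumerate(pairs):
        if k == key:
            pairs[i] = (key, value)
            return
    pairs.append((key, value))

pairs_set(pairs, "name", "Grace")               # update keeps the position
pairs_set(pairs, "email", "grace@example.com")  # new key goes to the end

# The same operations with OrderedDict: average O(1), no helpers needed.
od = OrderedDict([("id", 1), ("name", "Ada")])
od["name"] = "Grace"
od["email"] = "grace@example.com"

print(pairs_get(pairs, "name"), list(od.items()) == pairs)
```

Both versions end up with the same items in the same order; the difference is how much code and how many linear scans the list version needs to get there.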

In addition: lists are just one kind of ordered data structure. And although they are somewhat universal, in that you can build practically any data structure out of some combination of lists (if you bend over backwards), that does not mean you should always use lists.

I have no data on this, but I bet that performance for lists is universally better than OrderedDict.

Data (structures) don't have performance. Operations on data (structures) do. So it depends on which operations you are interested in. If you just need a list of pairs, a list is obviously correct, and iterating over it or indexing into it is quite efficient. However, if you want an ordered mapping, or even a tiny subset of mapping functionality (for example, handling duplicate keys), then a list alone is pretty terrible, as I explained above.

+6

This could probably have been just a comment ...

I think it would be questionable only if you used it in places where you do not need it (where order does not matter and a regular dict would suffice). Otherwise, the code is likely to be simpler than it would be with lists.

This is true for any language construct / library - if it simplifies your code, use a higher-level abstraction / implementation.

0

As long as you feel comfortable with this data structure and that it fits your needs, why care? It may not be more efficient (in terms of speed, etc.), but if it is there, it is obviously because it is useful in some cases (or no one would have thought to write it).

In Python, you can use three types of associative arrays:

  • the classic hash table, dict (no guaranteed order before Python 3.7)
  • OrderedDict (keeps the order in which keys were inserted)
  • binary trees (not in the standard library), which keep their keys sorted exactly the way you want, in a user-defined order (not necessarily alphabetical).

Thus, in fact, the order of the keys may matter. Just choose a structure that you think is more suitable for the job.

0

For CSV and similar structures where the same keys repeat in every record, use namedtuple. It is the best of both worlds.
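A small sketch of the namedtuple approach, assuming an invented CSV snippet whose header names happen to be valid Python identifiers (if they are not, `namedtuple` accepts `rename=True` to substitute positional names). The type name `Row` is arbitrary.

```python
import csv
import io
from collections import namedtuple

# Hypothetical input; in real code this would be an open file.
csv_text = "id,name,email\n1,Ada,ada@example.com\n2,Bob,bob@example.com\n"

reader = csv.reader(io.StringIO(csv_text))

# The header row defines the field names, and therefore the column order.
Row = namedtuple("Row", next(reader))

# Each remaining line becomes an immutable record with named fields.
records = [Row(*line) for line in reader]

print(Row._fields)
print(records[0].name, records[1].email)
```

Fields can be accessed by name or by position, and the column order lives in `Row._fields`. Note that `csv.reader` yields strings, so values such as `id` stay as text unless converted.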

-1

Source: https://habr.com/ru/post/948431/

