Why are tuples not listed in Elixir?

I need an efficient structure for an array of thousands of elements of the same type with the ability to do random access.

While the list is most effective at iterating and adding, it is too slow for random access, so it does not fit my needs.

The card works better. Howerver causes some overhead because it is for key-value pairs where the key can be anything, while I need an array with indices from 0 to N. As a result, my application worked too slowly with cards. I think this is unacceptable overhead for such a simple task as handling ordered lists with random access.

I found that a tuple is the most efficient structure in Elixir for my task. When compared to a card on my machine, it is faster

  • at iteration - 1.02x for 1_000, 1.13x for elements 1_000_000.
  • random access - 1.68x for 1_000, 2.48x for 1_000_000
  • and when copying - 2.82x for 1_000, 6.37x for 1_000_000.

As a result, my code on tuples is 5 times faster than the same code on maps. This probably does not require an explanation of why a tuple is more efficient than a map. The goal has been achieved, but everyone says “do not use tuples for a list of similar elements”, and no one can explain this rule (an example of such cases is https://stackoverflow.com/a/168331 )

Btw, there are tuples in Python. They are also immutable, but still repeatable.

So,

1. Why are tuples not listed in Elixir? Are there any technical or logical limitations?

2. And why shouldn't they be used as lists of similar items? Are there any disadvantages?

Note: why questions, not how. The explanation above is just an example where tuples work better than lists and maps.

+6
source share
2 answers

1. The reason is not to implement Enumerable for Tuple

From the Elixir mailing list:

If there is an implementation of the protocol for the tuple, it contradicts all the entries. Given that user protocol instances are almost always specific to entries that add a tuple, the entire Enumerable protocol will make it pretty useless.

- Peter Minten

I wanted to list the tuples first, and even the "Enumerable" on them was eventually implemented, which did not work.

- Chris Keele

How does this violate the protocol? I will try to put everything together and explain the problem from a technical point of view.

Tuples. . What is interesting about tuples is that they are mainly used for duck printing , using pattern matching . You are not required to create a new module for a new struct every time you need some new simple type. Instead, you create a tuple - a kind of object of a virtual type. Atoms are often used as the first elements of type type names, for example {:ok, result} and {:error, description} . This is how tuples are used almost everywhere in Elixir because it is their design goal. They are also used as the basis for " records ", which comes from Erlang. Elixir has structures for this purpose, but also provides a Record module for Erlang compatibility. Therefore, in most cases, tuples are separate heterogeneous data structures that should not be listed. Tuples should be considered as instances of various virtual types. There is even an @type directive that allows you to define custom types based on tuples. But remember that they are virtual, and is_tuple/1 still returns true for all these tuples.

Protocols . Elixir protocols, on the other hand, are class types that provide ad hoc polymorphism . For those who come from OOP, this is something like superclasses and multiple inheritance . One important thing the protocol does for you is automatic type checking. When you pass some data to a protocol function, it checks if this class belongs to this class, that is, this protocol is implemented for this data type. If not, you will get this error:

 ** (Protocol.UndefinedError) protocol Enumerable not implemented for {} 

Thus, Elixir saves your code from stupid errors and complex errors if you are not mistaken in architectural decisions.

Generally. Now imagine that we are implementing Enumerable for Tuple. What he does is list all the tuples, while 99.9% of the tuples in Elixir are not designed for this. All checks are broken. The tragedy is the same as if all the animals in the world began to quack. If a tuple is passed to the Enum or Stream module accidentally, you will not see a useful error message. Instead, your code will produce unexpected results, unpredictable behavior, and possibly data corruption.

2. Reason not to use tuples as collections

Good, reliable Elixir code should contain typespecs , which will help developers understand the code and give Dialyzer the ability to test the code for you. Imagine you need a collection of similar elements. The typepec for lists and maps might look like this:

 @type list_of_type :: [type] @type map_of_type :: %{optional(key_type) => value_type} 

But you cannot write the same type for a tuple, because {type} means "a tuple of one element of type type ". You can write typepec for a tuple of a predetermined length, for example {type, type, type} , or for a tuple of any elements such as tuple() , but there is no way to write a type specification for a tuple of similar elements just by design. Thus, choosing tuples to store your collection of elements means that you lose such good ability to make your code reliable.

Conclusion

The rule not to use tuples as lists of similar elements is a rule that explains how in most cases to choose the correct type in Elixir. Violation of this rule may be considered as a possible signal of poor design choice. When people say that “tuples are not for design collections,” it means not just “you are doing something unusual,” but “you can break Elixir’s functions by making the design wrong in your application.”

If you really want to use the tuple as a collection for some reason, and you are sure that you know what you are doing, then it would be nice to wrap it in some kind of struct . You can implement the Enumerable protocol for your structure without the risk of breaking the entire contents of tuples. It is worth noting that Erlang uses tuples as collections for the internal representation of array , gb_trees , gb_sets , etc.

 iex(1)> :array.from_list ['a', 'b', 'c'] {:array, 3, 10, :undefined, {'a', 'b', 'c', :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined}} 

Not sure if there is another technical reason not to use tuples as collections. If someone can provide another good explanation for the conflict between the Record and Enumerable protocol, they can improve that answer.

+3
source

Since you are sure that you need to use tuples, you can get the requested functionality at the expense of compilation time. The solution below will compile for a long time (consider ≈100s for @max_items 1000 ) After compilation, the runtime would please you. The same approach is used in the Elixir kernel to create modern UTF-8 string matches.

 defmodule Tuple.Enumerable do defimpl Enumerable, for: Tuple do @max_items 1000 def count(tuple), do: tuple_size(tuple) def member?(_, _), do: false # for the sake of compiling time def reduce(tuple, acc, fun), do: do_reduce(tuple, acc, fun) defp do_reduce(_, {:halt, acc}, _fun), do: {:halted, acc} defp do_reduce(tuple, {:suspend, acc}, fun) do {:suspended, acc, &do_reduce(tuple, &1, fun)} end defp do_reduce({}, {:cont, acc}, _fun), do: {:done, acc} defp do_reduce({value}, {:cont, acc}, fun) do do_reduce({}, fun.(value, acc), fun) end Enum.each( 1..@max _items-1, fn tot -> tail = Enum.join(Enum.map(1..tot, & "e_★_#{&1}"), ",") match = Enum.join(["value"] ++ [tail], ",") Code.eval_string( "defp do_reduce({#{match}}, {:cont, acc}, fun) do do_reduce({#{tail}}, fun.(value, acc), fun) end", [], __ENV__ ) end) defp do_reduce(huge, {:cont, _}, _) do raise Protocol.UndefinedError, description: "too huge #{tuple_size(huge)} > #{@max_items}", protocol: Enumerable, value: Tuple end end end Enum.each({:a, :b, :c}, fn e -> IO.puts "Iterating: #{e}" end) #⇒ Iterating: a #  Iterating: b #  Iterating: c 

The above implementation explicitly avoids the member? implementation member? since it will take even more time to compile until you have requested only iteration.

+1
source

Source: https://habr.com/ru/post/1013527/


All Articles