Foreach on hash variables in Perl

I am new to Perl scripts and have doubts about hash variables in foreach . I want to print all the values ​​of my hash. Here is the program:

 %colors = (a => 1, b=>2, c=>3, d=>4, e=>5); foreach $colors(keys %colors) { print "$colors{%colors} \n"; } 

Output:

 5 3 1 2 4 

Why are the values ​​sorted randomly? Or what is the logic of this accident? Please clarify my doubts.

+4
source share
4 answers

I think your confusion is that you do not know what a hash is. Most languages ​​have something similar to a repository of key values; in Ruby and Perl they are called hashes, on Java maps, in Python dictionaries, etc.

All of them are essentially the same thing, you insert a value with a unique key into some basic data structure to get direct access to it at the expense of memory.

So what happens when you add a key and value to a hash?

Hashes are built around the idea of hash functions that take on some value, since input calculates a unique result (ideally, each input has its own unique output). If two inputs coincide with the same output, this is called a collision.

Now we are at the point where we need to talk about how Hash is implemented, two classic examples - with one array or an array of linked lists. I will show an example of an array below.

Array

In the case of a simple array, the data structure underlying the Hash is an array of some size. The hash function is used to calculate the index into this array. If we adopt a simple hashing algorithm

 h(x) = length(x) % ARRAY_SIZE 

here x is a string, and ARRAY_SIZE is the size of our base array, this operator will ensure that all x values ​​fall in the range 0..ARRAY_SIZE - 1

To look at a visual example, consider an array of size 5:

  0 1 2 3 4 ------------------------------ | | | | | | ------------------------------ 

and suppose we are trying to save the value 5 with the abcd key, according to our hash algorithm

 h('abcd') = length('abcd') % ARRAY_SIZE = 4 % 5 = 4 

So, the value 5 will be stored in index 4 :

  0 1 2 3 4 ------------------------------ | | | | | 5 | ------------------------------ 

Now, what happens if we try to save the value 3 with the dcba key, these two keys will be different? They should be displayed in different places.

 h('dcba') = length('dcba') % ARRAY_SIZE = 4 % 5 = 4 

Oops! This key also displays index 4 , and what will we do now? Well, we can't just throw away the key-value pair, because the programmer obviously needs / needs this to pair in their Hash, so we need to decide what to do in the event of a collision. There are many algorithms that do this, but the easiest is to search for the next open slot in the array and save 3 them. So now our array looks like this:

  0 1 2 3 4 ------------------------------ | 3 | | | | 5 | ------------------------------ 

This was not an extremely detailed explanation, but hopefully it will give some insight into why getting values ​​from hashes seems random, because the underlying data structure is constantly changing, if you were to ask for the keys to your hash right now you are likely to return (3, 5) , even if you inserted 5 in the first place, just because 3 occurs first in the array.

Hope this was helpful.

+8
source

Citation perldata - Perl data types :

Hashes are unordered collections of scalar values ​​indexed by their associated string key.

You can sort keys, or if you want to keep the order specified in the initialization, use Tie :: Hash :: Indexed or Tie :: IxHash .

+7
source

The keys description in perldoc has the following snippet:

Hash entries are returned in a random order. Actual random order is specific to this hash; the exact same series of operations on two hashes can lead to a different order for each hash. Any insertion in the hash can change the order, just like any deletion, except that the last key returned by each or the keys can be deleted without changing the order. As long as this hash is not changed, you can rely on keys, values, and each to repeatedly return the same order to each other. See Perlsec Algorithmic Complexity Attack for details on why hash order is randomized. In addition to the guarantees provided here, the exact details of the Perl hash algorithm and the hash traversal order can be changed in any release of Perl.

Perlsec says the following about hash algorithms:

  • Hash Algorithm - Hash algorithms similar to those used in Perl are known to be vulnerable to attacking a collision with their hash function. Such attacks include creating a set of keys that collide in the same bucket, resulting in inefficient behavior. Such attacks often depend on the detection of the hash function seed used to map keys to buckets. This seed is then used to force the set of keys, which can be used to set up a denial of service attack. Perl 5.8.1 was amended to simplify Perl for such attacks, and then in Perl 5.18.0 these features were extended and additional protections added. At the time of this writing, Perl 5.18.0 is considered to be well protected from attacks of algorithmic complexity in the implementation of the hash. This is largely due to the following mitigation measures:

    • Hash seed randomization
      To make it impossible to know which seed generates a set of attack keys, this seed is accidentally initialized when the process starts. This can be overridden using the PERL_HASH_SEED environment variable, see PERL_HASH_SEED in perlrun. This environment variable controls how items are stored, not how they are represented using keys , values, and each .

    • Hash Trace Randomization
      No matter which seed is used in the hash function, the keys , values, and each return elements in each hash randomized order. Changing the hash by inserting will change the iteration order of this hash. You can override this behavior with hash_traversal_mask () from Hash :: Util or with the PERL_PERTURB_KEYS environment variable, see PERL_PERTURB_KEYS in perlrun. Note that this function controls the “visible” order of the keys, and not the actual order in which they are stored.

    • Bucket order instability
      When elements collide with a given hash bucket, the order they store in the chain is no longer predictable in Perl 5.18. This intends to complicate collision monitoring. You can override this behavior using the PERL_PERTURB_KEYS environment variable, see PERL_PERTURB_KEYS in perlrun.
    • New default hash function
      The default hash function has been changed to complicate the output of the hash.
    • Alternative hash functions
      The source code includes several hash algorithms to choose from. Although we believe that the default perl hash is reliable for attack, we have included the Siphash hash function as a return option. At the time of the release of Perl 5.18.0, Siphash is believed to have cryptographic strength. This is not the default value, as it is much slower than the default hash.

Without compiling custom Perl, there is no way to get the same behavior of any version prior to Perl 5.18.0. The closest you can get is to set PERL_PERTURB_KEYS to 0 and set the value of PERL_HASH_SEED to a known value. We do not recommend these settings for use in production due to the above safety considerations.

Perl never guaranteed any order of hash keys, and the order has changed several times over the life of Perl 5. In addition, the ordering of hash keys has always been and continues to depend on the insertion order and the history of changes made to the hash over its entire life.

Also note that although the order of the hash elements can be randomized, this “pseudo-ordering” should not be used for applications such as randomly shuffling a list (use List :: Util :: shuffle () for this, see List: : Util, the standard base module with Perl 5.8.0 or the CPAN algorithm module Algorithm :: Numeric :: Random) or to generate permutations (use, for example, CPAN modules Algorithm :: Permit or Algorithm :: FastPermute) or for any cryptographic applications.

+3
source

You can use sorting to print the results in alphabetical order (since your keys are alphanumeric) sorted as follows:

 %colors = ("a" => 1, "b"=>2, "c"=>3, "d"=>4, "e"=>5); foreach (sort keys %colors) { print $colors{$_} . "\n"; } 

Alternatively, if you prefer to sort by values:

 %colors = ("a" => 1, "b"=>2, "c"=>3, "d"=>4, "e"=>5); foreach (sort { $colors{$a} <=> $colors{$b} } keys %colors) { print $colors{$_} . "\n"; } 
+2
source

Source: https://habr.com/ru/post/1493417/


All Articles