Convert array to index hash in Ruby

I have an array and I want to make a hash so that I can quickly ask "is there X in the array?".

In perl, there is a simple (and quick) way to do this:

my @array = qw( 1 2 3 ); my %hash; @hash{@array} = undef; 

This generates a hash that looks like this:

 { 1 => undef, 2 => undef, 3 => undef, } 

The best I've come up with in Ruby is:

 array = [1, 2, 3] hash = Hash[array.map {|x| [x, nil]}] 

which gives:

 {1=>nil, 2=>nil, 3=>nil} 

Is there a better Ruby way?

EDIT 1

No, Array.include? this is not a good idea. Its slow . It executes the query in O (n) instead of O (1). In my array of examples there were three elements for brevity; suppose the actual one has a million elements. Let me do a little benchmarking:

 #!/usr/bin/ruby -w require 'benchmark' array = (1..1_000_000).to_a hash = Hash[array.map {|x| [x, nil]}] Benchmark.bm(15) do |x| x.report("Array.include?") { 1000.times { array.include?(500_000) } } x.report("Hash.include?") { 1000.times { hash.include?(500_000) } } end 

It produces:

  user system total real Array.include? 46.190000 0.160000 46.350000 ( 46.593477) Hash.include? 0.000000 0.000000 0.000000 ( 0.000523) 
+41
arrays ruby hash
Jan 04 '09 at 7:43
source share
14 answers

If you need a hash for membership, use Set :

Set

Set implements a set of unordered values ​​without duplicates. This is a hybrid of intuitive Array interaction tools and a quick hash search.

Set is easy to use with Enumerable objects (implementation of each ). Most initialization methods and binary operators accept generic Enumerable objects, except sets and arrays. An enumerable object can be converted to Set using to_set .

Set uses the Hash as storage, so you should note the following points:

  • Is element equality defined according to Object#eql? and Object#hash .
  • Set assumes that the identity of each element does not change when saved. Changing a set item will result in an untrusted state.
  • When a row is to be saved, a frozen copy of the row is saved instead, if the original row is not already frozen.

Comparison

The comparison operators < , > , <= and >= implemented as shorthand for the {proper _,} {subset ?, superset?} Methods. However, the operator <=> intentionally excluded, since not every pair of sets is comparable. ({x, y} versus {x, z}, for example)

Example

 require 'set' s1 = Set.new [1, 2] # -> #<Set: {1, 2}> s2 = [1, 2].to_set # -> #<Set: {1, 2}> s1 == s2 # -> true s1.add("foo") # -> #<Set: {1, 2, "foo"}> s1.merge([2, 6]) # -> #<Set: {1, 2, "foo", 6}> s1.subset? s2 # -> false s2.subset? s1 # -> true 

[...]

General class methods

new (enum = nil)

Creates a new set containing the elements of this enumerated object.

If a block is specified, enumeration elements are pre-processed by this block.

+43
Jan 04 '09 at 15:45
source share

try the following:

 a=[1,2,3] Hash[a.zip] 
+18
Jan 06 '13 at
source share

You can do this very convenient trick:

 Hash[*[1, 2, 3, 4].map {|k| [k, nil]}.flatten] => {1=>nil, 2=>nil, 3=>nil, 4=>nil} 
+14
Aug 02 '12 at 7:27
source share

If you want to quickly ask, "is there X in the array?" should you use Array#include? .

Edit (in response to adding to the OP):

If you want to quickly find the time, use Set. Having a hash that points to all nil is stupid. Conversion is a simple process with Array#to_set .

 require 'benchmark' require 'set' array = (1..1_000_000).to_a set = array.to_set Benchmark.bm(15) do |x| x.report("Array.include?") { 1000.times { array.include?(500_000) } } x.report("Set.include?") { 1000.times { set.include?(500_000) } } end 

Results on my machine:

  user system total real Array.include? 36.200000 0.140000 36.340000 ( 36.740605) Set.include? 0.000000 0.000000 0.000000 ( 0.000515) 

You should just use a set to start, instead of an array, so that conversion is never necessary.

+9
Jan 04 '09 at 8:08
source share

I am absolutely sure that there is no one-time smart way to build this hash. My tendency was to just be explicit and state what I am doing:

 hash = {} array.each{|x| hash[x] = nil} 

It does not look very elegant, but it is clear and does the job.

FWIW, your initial suggestion (at least Ruby 1.8.6) does not seem to work. I get the error "ArgumentError: odd number of arguments for Hash". Hash. [] A literal, even deferred list of values ​​is expected:

 Hash[a, 1, b, 2] # => {a => 1, b => 2} 

so I tried changing my code to:

 hash = Hash[*array.map {|x| [x, nil]}.flatten] 

but the performance is very heavy:

 #!/usr/bin/ruby -w require 'benchmark' array = (1..100_000).to_a Benchmark.bm(15) do |x| x.report("assignment loop") {hash = {}; array.each{|e| hash[e] = nil}} x.report("hash constructor") {hash = Hash[*array.map {|e| [e, nil]}.flatten]} end 

gives

  user system total real assignment loop 0.440000 0.200000 0.640000 ( 0.657287) hash constructor 4.440000 0.250000 4.690000 ( 4.758663) 

If I'm missing something here, a simple assignment loop seems to be the clearest and most effective way to create this hash.

+6
Jan 04 '09 at 14:16
source share

The champion defeated me. Dialing may be the answer.

You can do:

 require 'set' set = array.to_set set.include?(x) 
+5
Jan 04 '09 at 16:01
source share

Your way to create a hash looks good. I had nastiness in irb and this is another way

 >> [1,2,3,4].inject(Hash.new) { |h,i| {i => nil}.merge(h) } => {1=>nil, 2=>nil, 3=>nil, 4=>nil} 
+4
Jan 04 '09 at 9:24
source share

I think chrismear points to the use of assignment over creation. To make it all a little more ruby, I could suggest assigning each element something other than nil :

 hash = {} array.each { |x| hash[x] = 1 } # or true or something else "truthy" ... if hash[376] # instead of if hash.has_key?(376) ... end 

The problem with nil assignment is that you should use has_key? instead of [] , since [] gives you nil (your token value) if Hash does not have the specified key. You can work around this using a different default value, but why do the extra work?

 # much less elegant than above: hash = Hash.new(42) array.each { |x| hash[x] = nil } ... unless hash[376] ... end 
+2
Jan 04 '09 at 15:24
source share

Perhaps I misunderstand the purpose here; If you want to know if there is X in an array, why not make array.include? ("X")?

+1
Jan 04 '09 at 8:08
source share

Doing some benchmarking on offers so far suggests that the hash layout and creating the Gaius-based bindings is slightly faster than my map method (and assigning nil is slightly faster than assigning true). mtyaka and rampion. Offer 35% slower to create.

Regarding search queries, hash.include?(x) is a very small amount faster than hash[x] ; both are twice as fast as set.include?(x) .

  user system total real chrismear 6.050000 0.850000 6.900000 ( 6.959355) derobert 6.010000 1.060000 7.070000 ( 7.113237) Gaius 6.210000 0.810000 7.020000 ( 7.049815) mtyaka 8.750000 1.190000 9.940000 ( 9.967548) rampion 8.700000 1.210000 9.910000 ( 9.962281) user system total real times 10.880000 0.000000 10.880000 ( 10.921315) set 93.030000 17.490000 110.520000 (110.817044) hash-i 45.820000 8.040000 53.860000 ( 53.981141) hash-e 47.070000 8.280000 55.350000 ( 55.487760) 

Benchmarking Code:

 #!/usr/bin/ruby -w require 'benchmark' require 'set' array = (1..5_000_000).to_a Benchmark.bmbm(10) do |bm| bm.report('chrismear') { hash = {}; array.each{|x| hash[x] = nil} } bm.report('derobert') { hash = Hash[array.map {|x| [x, nil]}] } bm.report('Gaius') { hash = {}; array.each{|x| hash[x] = true} } bm.report('mtyaka') { set = array.to_set } bm.report('rampion') { set = Set.new(array) } end hash = Hash[array.map {|x| [x, true]}] set = array.to_set array = nil GC.start GC.disable Benchmark.bmbm(10) do |bm| bm.report('times') { 100_000_000.times { } } bm.report('set') { 100_000_000.times { set.include?(500_000) } } bm.report('hash-i') { 100_000_000.times { hash.include?(500_000) } } bm.report('hash-e') { 100_000_000.times { hash[500_000] } } end GC.enable 
+1
Jan 05 '09 at 8:29
source share

If you are not worried about what hash values

 irb(main):031:0> a=(1..1_000_000).to_a ; a.length => 1000000 irb(main):032:0> h=Hash[a.zip a] ; h.keys.length => 1000000 

It takes a second or so on my desktop.

+1
Aug 13 2018-10-14
source share

If you are looking for the equivalent of this Perl code:

 grep {$_ eq $element} @array 

You can simply use simple Ruby code:

 array.include?(element) 
0
Jan 04 '09 at 8:18
source share

Here's a neat way to cache requests using Hash:

 a = (1..1000000).to_a h = Hash.new{|hash,key| hash[key] = true if a.include? key} 

To a large extent, it creates a default constructor for new hash values, and then stores "true" in the cache if it is in an array (otherwise, otherwise). This allows lazy loading into the cache, just in case you are not using every element.

0
Jan 04 '09 at 15:46
source share

This saves 0 if your hash was [0,0,0,1,0]

  hash = {} arr.each_with_index{|el, idx| hash.merge!({(idx + 1 )=> el }) } 

Returns:

  # {1=>0, 2=>0, 3=>0, 4=>1, 5=>0} 
0
Apr 21 '14 at 12:17
source share



All Articles