Performance anomaly in Ruby Set.include? with symbols (2.2.2 vs 2.1.6)

While testing some code to find out whether a Set is really faster than an Array for membership checks via include?, I found a performance anomaly involving strings and symbols inside the collection.

First, the script I used for benchmarking. It creates an array of 50 random 50-character strings, takes a sample of 20, and checks whether each sampled value is included. The same data is used to create a set of strings, an array of symbols, and a set of symbols.

require 'benchmark/ips'
require 'set' # lowercase: 'Set' only loads on case-insensitive file systems

collection_size = 50
element_length = 50
sample_size = 20

Benchmark.ips do |x|

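  # 1..element_length yields exactly element_length characters per string
  # (0..element_length would yield one more than intended).
  array_of_strings = begin
    (1..collection_size).map { (1..element_length).map { ('a'..'z').to_a[rand(26)] }.join }
  end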
  array_of_symbols = array_of_strings.map(&:to_sym)
  set_of_strings = Set.new(array_of_strings)
  set_of_symbols = Set.new(array_of_symbols)

  sample_of_strings = array_of_strings.sample(sample_size)
  sample_of_symbols = array_of_symbols.sample(sample_size)

  x.report("array_of_strings: #{collection_size} elements with length #{element_length}, sample size #{sample_of_strings.length}") {
    sample_of_strings.each do |s|
      array_of_strings.include? s
    end
  }

  x.report("set_of_strings: #{collection_size} elements with length #{element_length}, sample size #{sample_of_strings.length}") {
    sample_of_strings.each do |s|
      set_of_strings.include? s
    end
  }

  x.report("array_of_symbols: #{collection_size} elements with length #{element_length}, sample size #{sample_of_symbols.length}") {
    sample_of_symbols.each do |s|
      array_of_symbols.include? s
    end
  }

  x.report("set_of_symbols: #{collection_size} elements with length #{element_length}, sample size #{sample_of_symbols.length}") {
    sample_of_symbols.each do |s|
      set_of_symbols.include? s
    end
  }

  x.compare!  
end

The test system is a 2011 MacBook Pro running OS X 10.10.4; the ruby versions were installed using rvm 1.26.11.
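
When several interpreters are installed side by side (for example via rvm), it can help to print the interpreter's own description at the top of each benchmark run, so that every result listing is tied to the exact build that produced it. A minimal sketch using Ruby's built-in constants:

```ruby
# Built-in constants that identify the running interpreter.
puts RUBY_DESCRIPTION # full build string: version, date/revision, platform
puts RUBY_VERSION     # just the version number
puts RUBY_PLATFORM    # platform string only
```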

ruby 2.2.2:

set_of_strings:      145878.6 i/s
set_of_symbols:      100100.1 i/s - 1.46x slower
array_of_symbols:    81680.0 i/s - 1.79x slower
array_of_strings:    43545.9 i/s - 3.35x slower

, , , , . , , , , , , . script , .

The same script under ruby 2.1.6:

set_of_symbols:      202362.3 i/s
set_of_strings:      145844.1 i/s - 1.39x slower
array_of_symbols:    39158.1 i/s - 5.17x slower
array_of_strings:    24687.8 i/s - 8.20x slower
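
The two result listings can be compared directly by dividing the i/s figures per benchmark. A minimal sketch that uses only the numbers reported above; the names `RESULTS_2_2_2`, `RESULTS_2_1_6` and `speedup` are made up for illustration:

```ruby
# i/s figures copied from the two benchmark runs above.
RESULTS_2_2_2 = {
  'set_of_strings'   => 145878.6,
  'set_of_symbols'   => 100100.1,
  'array_of_symbols' => 81680.0,
  'array_of_strings' => 43545.9
}.freeze

RESULTS_2_1_6 = {
  'set_of_strings'   => 145844.1,
  'set_of_symbols'   => 202362.3,
  'array_of_symbols' => 39158.1,
  'array_of_strings' => 24687.8
}.freeze

# Ratio of 2.2.2 throughput to 2.1.6 throughput, rounded to two decimals.
# Values below 1.0 mean the benchmark got slower in 2.2.2.
def speedup(name)
  (RESULTS_2_2_2.fetch(name) / RESULTS_2_1_6.fetch(name)).round(2)
end

RESULTS_2_2_2.each_key do |name|
  puts format('%-18s %.2fx', "#{name}:", speedup(name))
end
```

By these numbers the set-of-symbols lookup runs at roughly half its 2.1.6 speed under 2.2.2, the set-of-strings lookup is unchanged, and both array lookups get faster.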

, , , ruby 2.2.2, , 2.1.6 .

- . , , 2.2.2 2.1.6, . , , 2.1.6. 2.2.2!

script, . i/s 2.1.6 2.2.2, 2.2.2 .

  • - ?
  • -, script?
  • , ruby?

Edit 1:

I also tested a Hash instead of a Set, using the keys 1 to 1000 as strings / symbols and looking them up via Hash[k].

Ruby 2.2.2:

h_string: 1000 keys, sample size 200:    29374.4 i/s
h_symbol: 1000 keys, sample size 200:    10604.7 i/s - 2.77x slower

Ruby 2.1.6:

h_symbol: 1000 keys, sample size 200:    31561.9 i/s
h_string: 1000 keys, sample size 200:    25589.7 i/s - 1.23x slower

- 2.2.2 , script:

require 'benchmark/ips'

collection_size = 1000
sample_size = 200

Benchmark.ips do |x|

  h_string = Hash.new 
  h_symbol = Hash.new

  (1..collection_size).each {|k| h_string[k.to_s] = 1}
  (1..collection_size).each {|k| h_symbol[k.to_s.to_sym] = 1}

  sample_of_string_keys = h_string.keys.sample(sample_size)
  sample_of_symbol_keys = sample_of_string_keys.map(&:to_sym)

  x.report("h_string: #{collection_size} keys, sample size #{sample_of_string_keys.length}") {
    sample_of_string_keys.each do |s|
      h_string[s]
    end
  }

  x.report("h_symbol: #{collection_size} keys, sample size #{sample_of_symbol_keys.length}") {
    sample_of_symbol_keys.each do |s|
      h_symbol[s]
    end
  }

  x.compare!  
end
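
The Hash benchmark is relevant to the Set results because Set, as shipped in the standard library of the 2.x versions tested here, keeps its elements as keys of an internal Hash, so Set#include? ends up as a Hash key lookup. A small sketch of that equivalence; the sample keys and variable names are made up for illustration:

```ruby
require 'set'

# Illustrative keys, built the same way as the keys of h_symbol.
keys = %w[1 2 3].map(&:to_sym)

set_of_symbols  = Set.new(keys)
hash_of_symbols = {}
keys.each { |k| hash_of_symbols[k] = 1 }

# Probe two keys that are present and one that is not.
probes = [:'1', :'3', :'42']

set_hits  = probes.map { |k| set_of_symbols.include?(k) }
hash_hits = probes.map { |k| hash_of_symbols.key?(k) }

puts set_hits.inspect
puts hash_hits.inspect
```

Both lines print the same list of booleans, which is why a slowdown in symbol-keyed Hash lookups would be expected to show up in Set.include? as well.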

Edit 2:

ruby 2.3.0dev (2015-07-26 trunk 51391) [x86_64-darwin14], , collection_size sample_size ruby 2.1.6

, 10000 100 , 2.1.6 ( 3 , 2.2.2). , , , , .

Edit 3:

comment by @cremno 2.2 2.2 , 2.1.6

  • , , . , 50 -, .
  • , ruby 2.3 backported code, , 2.2.3, ,
  • , " " "Set.include? , Array.include? '
  • Symbol , .

Source: https://habr.com/ru/post/1599679/

