Is there a built-in lazy hash in Ruby?

Question

I need to fill a Hash with various values. Some of the values are accessed often, while others are accessed only rarely.

The problem is that the values come from expensive computations, so filling the Hash becomes very slow even with only a few keys.

Using a cache is not an option in my case.

How can I make the Hash compute a value only when its key is first accessed, rather than when it is added?

That way, rarely used values would not slow down the filling process.

I am looking for something "kind of asynchronous", or lazily accessed.

+4

ruby

Steve katz Nov 17 '12 at 13:42

4 answers

You can define your own indexer like this:

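    class MyHash
      def initialize
        @cache = {}
      end

      # Compute the value on first access, then reuse the stored result.
      # fetch (rather than ||) also keeps cached nil/false values.
      def [](key)
        @cache.fetch(key) { @cache[key] = compute(key) }
      end

      def []=(key, value)
        @cache[key] = value
      end

      # Placeholder: put the real (slow) computation here.
      def compute(key)
        1
      end
    end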

and use it as follows:

    1.9.3p286 :014 > hash = MyHash.new
     => #<MyHash:0x007fa0dd03a158 @cache={}>
    1.9.3p286 :019 > hash["test"]
     => 1
    1.9.3p286 :020 > hash
     => #<MyHash:0x007fa0dd03a158 @cache={"test"=>1}>
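As a variation on the same idea, the computation could be passed in as a block instead of being hard-coded in a `compute` method. This is only a sketch: `BlockBackedHash` and the squaring block are hypothetical names chosen for illustration, not part of the answer above.

```ruby
# Hypothetical variant of the wrapper above: the computation is supplied as a block.
class BlockBackedHash
  def initialize(&compute)
    @compute = compute
    @cache = {}
  end

  # Run the block only on the first read of a key; later reads hit the cache.
  def [](key)
    @cache.fetch(key) { @cache[key] = @compute.call(key) }
  end

  def []=(key, value)
    @cache[key] = value
  end
end

calls = 0
squares = BlockBackedHash.new do |n|
  calls += 1 # count how often the "slow" work actually runs
  n * n
end

squares[12] # computes the value
squares[12] # served from the cache
puts calls  # prints 1
```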

+2

Candleide Nov 17 '12 at 14:06

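You can use this:

    class LazyHash < Hash
      # On first access, call the stored block, cache its result, and return it.
      def [](key)
        pending = (@pending ||= {})
        return super unless pending.key?(key)
        self[key] = pending.delete(key).call
      end

      # Register a block that is evaluated only when the key is first read.
      def lazy_update(key, &block)
        (@pending ||= {})[key] = block
        self[key] = block # placeholder so the key is visible in #keys
      end
    end

Your lazy hash will behave like a normal Hash, because it actually is a real Hash.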

Watch a live demo here

*** UPDATE - answer to the question about nested procs ***

Yes, it will work, but it is cumbersome.

See updated answer.

Use lazy_update instead of []= to add lazy values to your hash.

+2

user904990 Nov 17 '12 at 15:04

This is not strictly an answer to the body of your question, but Enumerator::Lazy (obtained through Enumerable#lazy) will definitely be part of Ruby 2.0. It will let you evaluate compositions of iterators lazily:

    lazy = [1, 2, 3].lazy.select(&:odd?)
    # => #<Enumerator::Lazy: #<Enumerator::Lazy: [1, 2, 3]>:select>
    lazy.to_a
    # => [1, 3]
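To make the laziness visible, here is a small sketch (assuming Ruby 2.0 or later) that chains `map` and `select` over an infinite range. The `evaluated` array is only there to record which elements were actually processed.

```ruby
# Lazily chain map/select over an infinite range; only as many elements
# as first(3) needs are ever pulled through the pipeline.
evaluated = []
result = (1..Float::INFINITY).lazy
  .map    { |n| evaluated << n; n * 2 }
  .select { |n| (n % 3).zero? }
  .first(3)

p result    # prints [6, 12, 18]
p evaluated # prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Without `.lazy`, the `map` call would never return, because it would try to process the whole infinite range first.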

0

pje Nov 18 '12 at 0:39

Jonathan tran · Accepted Answer · 2012-11-17T14:57:19+0000

There are many different ways to approach this. I recommend using an instance of a class that you define instead of a Hash. For example, instead of ...

    # Example of slow code using regular Hash.
    h = Hash.new
    h[:foo] = some_long_computation
    h[:bar] = another_long_computation

    # Access value.
    puts h[:foo]

... create your own class and define methods, for example ...

    class Config
      def foo
        some_long_computation
      end

      def bar
        another_long_computation
      end
    end

    config = Config.new
    puts config.foo

If you want an easy way to cache the long computations, or it absolutely must be a Hash rather than your own class, you can now wrap the Config instance with a Hash.

    config = Config.new
    h = Hash.new { |h, k| h[k] = config.send(k) }

    # Access foo.
    puts h[:foo]
    puts h[:foo] # Not computed again. Cached from previous access.

One problem with the above example is that h.keys will not include :bar because you have not accessed it yet. So you could not, for example, iterate over all keys or entries in h, because they do not exist until they are accessed. Another potential problem is that your keys must be valid Ruby identifiers, so arbitrary String keys with spaces will not work when defining them on Config.
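Both limitations can be seen in a short sketch. The `Config` class here is a stand-in whose method bodies are placeholders rather than real long computations.

```ruby
# Stand-in for the Config class above; the method bodies are placeholders.
class Config
  def foo
    :foo_value
  end

  def bar
    :bar_value
  end
end

config = Config.new
h = Hash.new { |hash, key| hash[key] = config.send(key) }

p h.keys       # prints []
h[:foo]        # first access computes and stores the value
p h.keys       # prints [:foo]
p h.key?(:bar) # prints false

# A key that does not name a Config method cannot be resolved at all.
begin
  h["invalid Ruby name"]
rescue NoMethodError
  puts "no such Config method"
end
```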

If it matters to you, there are different ways to deal with it. One way is to populate your hash with thunks and force the thunks on access.

    class HashWithThunkValues < Hash
      def [](key)
        val = super
        if val.respond_to?(:call)
          # Force the thunk to get the actual value.
          val = val.call
          # Cache the actual value so we never run the long computation again.
          self[key] = val
        end
        val
      end
    end

    h = HashWithThunkValues.new

    # Populate hash.
    h[:foo] = ->{ some_long_computation }
    h[:bar] = ->{ another_long_computation }
    h["invalid Ruby name"] = ->{ a_third_computation } # A key that is not a valid Ruby identifier.

    # Access hash.
    puts h[:foo]
    puts h[:foo] # Not computed again. Cached from previous access.
    p h.keys     # prints [:foo, :bar, "invalid Ruby name"]

One caveat with this last example is that it will not work if your values are themselves callable, because it cannot distinguish between a thunk that must be forced and a value.
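The caveat is easy to reproduce. In this sketch the class is repeated in compact form so the snippet runs on its own; `callback` is a hypothetical value that the caller wants stored as-is.

```ruby
# Compact copy of the respond_to?(:call) version, repeated so this runs standalone.
class HashWithThunkValues < Hash
  def [](key)
    val = super
    if val.respond_to?(:call)
      val = val.call
      self[key] = val
    end
    val
  end
end

h = HashWithThunkValues.new

# Intended as data (a callback to hand to someone else), not as a thunk.
callback = -> { :ran }
h[:on_done] = callback

p h[:on_done]                 # prints :ran
p h[:on_done].equal?(callback) # prints false
```

The hash called the lambda and replaced it with its result, so the original callable can no longer be retrieved.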

Again, there are ways to handle this. One is to store a flag that marks whether the value has been evaluated, but that requires extra memory for every entry. A better way would be to define a new class to mark that a Hash value is an unevaluated thunk.

    class Unevaluated < Proc
    end

    class HashWithThunkValues < Hash
      def [](key)
        val = super
        # Only call if it is unevaluated.
        if val.is_a?(Unevaluated)
          # Force the thunk to get the actual value.
          val = val.call
          # Cache the actual value so we never run the long computation again.
          self[key] = val
        end
        val
      end
    end

    # Now you must populate like so.
    h = HashWithThunkValues.new
    h[:foo] = Unevaluated.new { some_long_computation }
    h[:bar] = Unevaluated.new { another_long_computation }
    h["invalid Ruby name"] = Unevaluated.new { a_third_computation } # A key that is not a valid Ruby identifier.
    h[:some_proc] = Unevaluated.new { Proc.new {|x| x + 2 } }

The disadvantage of this is that you now have to remember to use Unevaluated.new when populating your Hash. If you want all values to be lazy, you can also override []=. I don't think this actually saves much typing, because you still have to use Proc.new, proc, lambda, or ->{} to create the block in the first place. But it might be worthwhile. If you did, it might look something like this.

    class HashWithThunkValues < Hash
      def []=(key, val)
        super(key, val.respond_to?(:call) ? Unevaluated.new(&val) : val)
      end
    end

So here is the complete code.

    class HashWithThunkValues < Hash
      # This can be scoped inside now since it is not used publicly.
      class Unevaluated < Proc
      end

      def [](key)
        val = super
        # Only call if it is unevaluated.
        if val.is_a?(Unevaluated)
          # Force the thunk to get the actual value.
          val = val.call
          # Cache the actual value so we never run the long computation again.
          # store is used instead of self[key] = val so that a callable result
          # is not wrapped as Unevaluated again by the []= override below.
          store(key, val)
        end
        val
      end

      def []=(key, val)
        super(key, val.respond_to?(:call) ? Unevaluated.new(&val) : val)
      end
    end

    h = HashWithThunkValues.new

    # Populate.
    h[:foo] = ->{ some_long_computation }
    h[:bar] = ->{ another_long_computation }
    h["invalid Ruby name"] = ->{ a_third_computation } # A key that is not a valid Ruby identifier.
    h[:some_proc] = ->{ Proc.new {|x| x + 2 } }
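For comparison, the flag-based alternative mentioned earlier might look roughly like the sketch below. `FlaggedLazyHash`, `Entry`, and the `lazy` method are hypothetical names invented for this illustration. It wraps a plain Hash rather than subclassing it and only implements the few methods shown.

```ruby
# Hypothetical flag-based variant: each entry remembers whether it has been forced.
class FlaggedLazyHash
  Entry = Struct.new(:evaluated, :payload)

  def initialize
    @entries = {}
  end

  # Store a block to be run on first read.
  def lazy(key, &thunk)
    @entries[key] = Entry.new(false, thunk)
  end

  # Store a plain value, which may itself be callable.
  def []=(key, value)
    @entries[key] = Entry.new(true, value)
  end

  def [](key)
    entry = @entries[key]
    return nil if entry.nil?

    unless entry.evaluated
      entry.payload = entry.payload.call # force the thunk once
      entry.evaluated = true
    end
    entry.payload
  end

  def keys
    @entries.keys
  end
end

runs = 0
h = FlaggedLazyHash.new
h.lazy(:foo) { runs += 1; 42 }
h[:handler] = -> { :stored_as_is }

h[:foo]
h[:foo]
p runs              # prints 1
p h[:handler].call  # prints :stored_as_is
p h.keys            # prints [:foo, :handler]
```

The cost is one extra `Entry` object per key, which is the memory overhead the answer refers to.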
