Ruby: optimize => phrase.split(delimiter).collect { |p| p.lstrip.rstrip }

Ruby: what is the most efficient expression that produces the same result as

phrase.split(delimiter).collect {|p| p.lstrip.rstrip } 
3 answers

Optimized for clarity, I would prefer the following:

 phrase.split(delimiter).collect(&:strip) 
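As a quick sanity check, the sketch below compares the shorter form with the original expression on one sample input. The `phrase` and `delimiter` values are made up for illustration; substitute your own data.

```ruby
# Hypothetical sample input, not taken from the question.
phrase = "  alpha -  beta\t- gamma  "
delimiter = "-"

# The expression from the question.
original = phrase.split(delimiter).collect { |p| p.lstrip.rstrip }

# The shorter form using Symbol#to_proc.
shorter = phrase.split(delimiter).collect(&:strip)

puts original == shorter  # true for this input
p shorter                 # ["alpha", "beta", "gamma"]
```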

But I assume that you want to optimize for speed. I do not know why others are speculating. The only way to find out which is faster is to benchmark your code.

Make sure you adjust the benchmark parameters to your use case; this is just an example.

 require "benchmark"

 # Adjust parameters below for your typical use case.
 n = 10_000
 input = " This is - an example. - A relatively long string " +
   "- delimited by dashes. - Adjust if necessary " * 100
 delimiter = "-"

 Benchmark.bmbm do |bench|
   bench.report "collect { |s| s.lstrip.rstrip }" do
     # Your example.
     n.times { input.split(delimiter).collect { |s| s.lstrip.rstrip } }
   end

   bench.report "collect { |s| s.strip }" do
     # Use .strip instead of .lstrip.rstrip.
     n.times { input.split(delimiter).collect { |s| s.strip } }
   end

   bench.report "collect { |s| s.strip! }" do
     # Use .strip! to modify strings in-place.
     n.times { input.split(delimiter).collect { |s| s.strip! } }
   end

   bench.report "collect(&:strip!)" do
     # Slow block creation (&:strip! syntax).
     n.times { input.split(delimiter).collect(&:strip!) }
   end

   bench.report "split(/\\s*\#{delim}\\s*/) (static)" do
     # Use static regex -- only possible if delimiter doesn't change.
     # "\\s" is needed here: inside a double-quoted string, "\s" is a literal space.
     re = Regexp.new("\\s*#{delimiter}\\s*")
     n.times { input.split(re) }
   end

   bench.report "split(/\\s*\#{delim}\\s*/) (dynamic)" do
     # Use dynamic regex, slower to create every time?
     n.times { input.split(Regexp.new("\\s*#{delimiter}\\s*")) }
   end
 end

Results on my laptop with the parameters listed above:

                                         user     system      total        real
 collect { |s| s.lstrip.rstrip }     7.970000   0.050000   8.020000 (  8.246598)
 collect { |s| s.strip }             6.350000   0.050000   6.400000 (  6.837892)
 collect { |s| s.strip! }            5.110000   0.020000   5.130000 (  5.148050)
 collect(&:strip!)                   5.700000   0.030000   5.730000 (  6.010845)
 split(/\s*#{delim}\s*/) (static)    6.890000   0.030000   6.920000 (  7.071058)
 split(/\s*#{delim}\s*/) (dynamic)   6.900000   0.020000   6.920000 (  6.983142)

From the above, I can conclude:

  • Using strip instead of .lstrip.rstrip is faster.
  • Preferring &:strip! over { |s| s.strip! } comes with a performance cost.
  • Simple regex patterns are almost as fast as using split + strip .
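One caveat about the `strip!` variants in this benchmark: `String#strip!` returns `nil` when the string needed no stripping, so `collect { |s| s.strip! }` is not always a drop-in replacement for `collect { |s| s.strip }`. The following is a minimal sketch with a made-up input; using `each(&:strip!)` to keep the in-place behavior is my own suggestion, not something the benchmark measured.

```ruby
# Hypothetical input: the middle chunk has no surrounding whitespace.
input = " a -b- c "

# collect gathers the return values of strip!, which is nil for "b".
with_bang = input.split("-").collect { |s| s.strip! }

# strip always returns a string.
with_strip = input.split("-").collect { |s| s.strip }

# each returns the receiver array, whose strings were stripped in place.
in_place = input.split("-").each(&:strip!)

p with_bang   # ["a", nil, "c"]
p with_strip  # ["a", "b", "c"]
p in_place    # ["a", "b", "c"]
```

If the faster in-place variant matters for your workload, benchmark the `each(&:strip!)` form separately before relying on it.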

Things I can think of that may affect the result:

  • The length of the delimiter (and whether it is a space).
  • The length of the strings you want to split.
  • The length of the chunks between delimiters in each string.

But do not take my word for it. Measure it!


You can try regex:

 phrase.strip.split(/\s*#{delimiter}\s*/) 
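A small sketch of the regex approach, assuming the delimiter may contain regex metacharacters. Interpolating it unescaped would make it part of the pattern (for example, `|` would act as alternation), so this version wraps it in `Regexp.escape`. The sample `phrase` and `delimiter` are made up for illustration.

```ruby
# Hypothetical sample input with a delimiter that is a regex metacharacter.
phrase = "  a | b|c  "
delimiter = "|"

# Escape the delimiter so it is matched literally.
pattern = /\s*#{Regexp.escape(delimiter)}\s*/

via_regex = phrase.strip.split(pattern)
via_strip = phrase.split(delimiter).collect(&:strip)

p via_regex               # ["a", "b", "c"]
puts via_regex == via_strip  # true for this input
```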

The only optimization I see is replacing

 p.lstrip.rstrip 

with

 p.strip! 

Source: https://habr.com/ru/post/1302917/

