How to efficiently cut binary data in Ruby?

After looking at the SO post "Ruby: Split binary data", I used the following code, which works.

z = 'A' * 1_000_000
z.bytes.each_slice( STREAMING_CHUNK_SIZE ).each do | chunk | 
  c = chunk.pack( 'C*' )
end

However, it is very slow:

Benchmark.realtime do
  ...
=> 0.0983949700021185

That is 98 ms to cut and pack a 1 MB file, which is far too slow.

Use case:
The server receives binary data from an external API and passes it on with socket.write chunk.pack( 'C*' ).
The data is expected to be between 50 KB and 5 MB, with an average of 500 KB.

So how can binary data be cut efficiently in Ruby?

+4
2 answers

Notes

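Your code looks good and uses the correct Ruby methods and syntax, but it still:

  • creates a huge array of integers (String#bytes)
  • slices this huge array into many smaller arrays (each_slice)
  • packs those arrays back into strings (pack)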

Alternative

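The following code extracts the chunks directly from the string, without converting anything:

def get_binary_chunks(string, size)
  # bytesize counts bytes, so the chunk count is also right for multibyte strings
  Array.new(((string.bytesize + size - 1) / size)) { |i| string.byteslice(i * size, size) }
end

The expression (string.bytesize + size - 1) / size rounds the chunk count up, so the last chunk is not lost when it is shorter than size.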
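To make the rounding concrete, here is a minimal sketch with a 10-byte string and chunks of 4 bytes. The helper names chunk_count and binary_chunks are hypothetical and only mirror the idea above.

```ruby
# Hypothetical helpers for illustration; not part of the original answer.
def chunk_count(total_bytes, size)
  # Integer division rounds down, so adding size - 1 first rounds up.
  (total_bytes + size - 1) / size
end

def binary_chunks(string, size)
  Array.new(chunk_count(string.bytesize, size)) { |i| string.byteslice(i * size, size) }
end

data = ('A' * 10).b               # 10 bytes, binary encoding
chunks = binary_chunks(data, 4)

p chunk_count(10, 4)              # 3 chunks are needed
p chunks.map(&:bytesize)          # two full chunks and a short final one
p chunks.join == data             # nothing is lost
```

Without the added size - 1, plain integer division would give 10 / 4 = 2 and the trailing 2 bytes would be dropped.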

With a 500 kB PDF file and chunks of 12345 bytes, Fruity reports:

Running each test 16 times. Test will take about 28 seconds.
_eric_duminil is faster than _b_seven by 380x ± 100.0

In this example, get_binary_chunks is also about 6 times faster than StringIO#each(n).
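The comparison can be reproduced roughly with the standard library. This sketch uses Benchmark instead of Fruity, and a generated string instead of a real PDF, so the absolute numbers and the ratio will differ from the figures above. The method names pack_chunks and byteslice_chunks are assumptions made for this example.

```ruby
require 'benchmark'

CHUNK_SIZE = 12_345                 # chunk size taken from the figures above
data = ('A' * 500_000).b            # stand-in for a 500 kB binary payload

# The approach from the question: bytes -> array slices -> packed strings.
def pack_chunks(string, size)
  string.bytes.each_slice(size).map { |slice| slice.pack('C*') }
end

# The approach from the answer: slice the string directly.
def byteslice_chunks(string, size)
  Array.new((string.bytesize + size - 1) / size) { |i| string.byteslice(i * size, size) }
end

Benchmark.bm(12) do |bm|
  bm.report('bytes/pack') { 10.times { pack_chunks(data, CHUNK_SIZE) } }
  bm.report('byteslice')  { 10.times { byteslice_chunks(data, CHUNK_SIZE) } }
end
```

Both methods return the same chunks, so the difference in time comes only from how they get there.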

If you are sure the string is binary (not UTF-8 with multibyte characters such as 'ä'), you can use slice instead of byteslice:

def get_binary_chunks(string, size)
  Array.new(((string.length + size - 1) / size)) { |i| string.slice(i * size, size) }
end

This makes the code even faster (about 500x compared with the original method).

If you use this code with a Unicode string, each chunk will contain size characters but may occupy more than size bytes.
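The difference between counting characters and counting bytes is easy to see with a short UTF-8 string. This is a small illustrative sketch; the variable names are arbitrary.

```ruby
text = "\u00E4" * 4                 # "ääää": 4 characters, 8 bytes in UTF-8
p text.length                       # character count
p text.bytesize                     # byte count

by_chars = text.slice(0, 2)         # slice counts characters
by_bytes = text.byteslice(0, 3)     # byteslice counts bytes

p by_chars.bytesize                 # 2 characters take 4 bytes here
p by_bytes.valid_encoding?          # false: the second character was cut in half

binary = text.b                     # binary copy: every byte is its own "character"
p binary.slice(0, 3) == binary.byteslice(0, 3)
```

For a binary-encoded string the two methods return the same result, which is why slice is only safe once the encoding is known to be binary.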

Finally, if you do not need an array of strings, you can use the chunks directly:

def send_binary_chunks(socket, string, size)
  ((string.length + size - 1) / size).times do |i|
    socket.write string.slice(i * size, size)
  end
end
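A quick way to check this kind of loop without a network connection is to write into a StringIO, which accepts write like a socket does. In this sketch, send_chunks and fake_socket are hypothetical names, and byteslice/bytesize are used so it does not depend on the string's encoding.

```ruby
require 'stringio'

# Hypothetical variant of the loop above, written against any object with #write.
def send_chunks(io, string, size)
  ((string.bytesize + size - 1) / size).times do |i|
    io.write(string.byteslice(i * size, size))
  end
end

payload = Random.new(42).bytes(50_000)    # 50 kB of pseudo-random binary data
fake_socket = StringIO.new(String.new(encoding: Encoding::BINARY))

send_chunks(fake_socket, payload, 4096)

p fake_socket.string.bytesize             # every byte was written
p fake_socket.string == payload           # and in the original order
```

No intermediate array is built; each chunk is written and can then be garbage collected.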
+5

You can use StringIO#each(n) with a BINARY-encoded string:

require 'stringio'
string.force_encoding(Encoding::BINARY)
StringIO.new(string).each(size) { |chunk| socket.write(chunk) }
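One detail worth knowing: each(n) is line-oriented, so n is an upper limit and a chunk also ends at a newline byte. The sketch below compares it with a plain read(n) loop, which is offered here as an alternative and is not part of the original answer. The sample data is made up.

```ruby
require 'stringio'

data = "AB\nCDEFGH".b               # 9 bytes, including one newline byte

# each(4): at most 4 bytes per chunk, but a newline also ends a chunk.
line_chunks = []
StringIO.new(data).each(4) { |chunk| line_chunks << chunk }

# read(4): exactly 4 bytes per chunk until the data runs out.
fixed_chunks = []
io = StringIO.new(data)
while (chunk = io.read(4))          # read returns nil at end of data
  fixed_chunks << chunk
end

p line_chunks.map(&:bytesize)
p fixed_chunks.map(&:bytesize)
p line_chunks.join == data && fixed_chunks.join == data
```

Either way the receiver gets the same bytes in the same order; only the chunk boundaries differ, which matters if the chunk size must be fixed.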


+3

Source: https://habr.com/ru/post/1665543/

