Digest::CRC32 with Zlib

In my code, I need to hash files using a variety of algorithms, including CRC32. Since I also use other cryptographic hash functions from the Digest family, I thought it would be nice to maintain a consistent interface for all of them.

For the record, I did find digest-crc, which does exactly what I want. The thing is, Zlib is part of the standard library and has a working implementation of CRC32 that I would like to reuse. It is also written in C, so it should perform better than digest-crc, which is a pure-Ruby implementation.

Implementing Digest::CRC32 actually looked pretty simple at first:

    %w(digest zlib).each { |f| require f }

    class Digest::CRC32 < Digest::Class
      include Digest::Instance

      def update(str)
        @crc32 = Zlib.crc32(str, @crc32)
      end

      def initialize; reset; end
      def reset; @crc32 = 0; end
      def finish; @crc32.to_s; end
    end

Everything looks right:

    crc32 = File.open('Rakefile') { |f| Zlib.crc32 f.read }
    digest = Digest::CRC32.file('Rakefile').digest!.to_i
    crc32 == digest
    => true

Unfortunately, not everything works:

    Digest::CRC32.file('Rakefile').hexdigest!
    => "313635393830353832"

    # What I actually expected was:
    Digest::CRC32.file('Rakefile').digest!.to_i.to_s(16)
    => "9e4a9a6"

hexdigest basically returns Digest.hexencode(digest), which works with the digest value at the byte level. I'm not sure how that function works, so I was wondering whether it is possible to achieve this using only the integer returned by Zlib.crc32.

3 answers

Digest expects the digest to be returned as the raw bytes that make up the checksum, i.e. in the case of CRC32, the 4 bytes that make up the 32-bit integer. Instead, you are returning a string that contains the base-10 representation of that integer.
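To make the difference concrete, here is a small sketch using the value from the question (165980582, which is 0x09e4a9a6). It only assumes that Digest.hexencode emits two hex characters per input byte; the variable names are illustrative.

```ruby
require 'digest'

# The same checksum in two different shapes.
decimal_string = 165980582.to_s        # what `finish` returns in the question
raw_bytes      = "\x09\xE4\xA9\xA6".b  # the same number as four raw bytes

# hexencode works byte by byte, so the decimal string is encoded as the
# ASCII codes of its digits ("1" is 0x31, "6" is 0x36, and so on).
puts Digest.hexencode(decimal_string)  # 313635393830353832
puts Digest.hexencode(raw_bytes)       # 09e4a9a6
```

The first output is exactly the unexpected hexdigest from the question, and the second is the expected value with its leading zero.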

You want something like

 [@crc32].pack('V') 

to turn this integer into the bytes that represent it. Go and read up on pack and its various format specifiers: there are many ways to pack an integer, depending on whether the bytes should be in native-endian, big-endian, or little-endian order, so you should work out which one fits your needs.
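As a rough illustration of how the choice of specifier changes the result, the sketch below packs the checksum from the question with the three 32-bit unsigned directives of Array#pack. The variable names are only for illustration.

```ruby
crc = 0x09e4a9a6  # 165980582, the checksum from the question

big    = [crc].pack('N')  # 32-bit unsigned, big-endian (network byte order)
little = [crc].pack('V')  # 32-bit unsigned, little-endian
native = [crc].pack('L')  # 32-bit unsigned, byte order of the host machine

puts big.unpack1('H*')     # 09e4a9a6
puts little.unpack1('H*')  # a6a9e409
puts native.unpack1('H*')  # matches one of the two above, depending on the CPU
```

Only the big-endian form hex-encodes to the digits in their usual reading order, which is probably what you want if hexdigest should look like the result of to_s(16).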


Sorry, this does not answer your question, but may help.

First, when reading in a file, make sure that you pass the "rb" mode. I can see that you are not on Windows, but if your code ever ends up running on a Windows machine, it will not behave the same way, especially when reading Ruby files. Example:

    crc32 = File.open('test.rb') { |f| Zlib.crc32 f.read }
    #=> 189072290
    digest = Digest::CRC32.file('test.rb').digest!.to_i
    #=> 314435800
    crc32 == digest
    #=> false

    crc32 = File.open('test.rb', "rb") { |f| Zlib.crc32 f.read }
    #=> 314435800
    digest = Digest::CRC32.file('test.rb').digest!.to_i
    #=> 314435800
    crc32 == digest
    #=> true

The above will work on all platforms and all Rubies that I know of. But that is not what you asked.

I am pretty sure that the hexdigest and digest methods in your example above are working as they should, though:
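If the files can be large, the same binary-mode idea can be combined with chunked reading, since Zlib.crc32 accepts a running checksum as its second argument. The following is only a sketch; the helper name crc32_of_file and the chunk size are my own choices, not something from the question.

```ruby
require 'zlib'

# Hypothetical helper: stream a file through Zlib.crc32 in binary mode,
# so no newline or end-of-file translation happens on Windows.
def crc32_of_file(path, chunk_size = 16 * 1024)
  crc = 0
  File.open(path, 'rb') do |f|
    # IO#read returns nil at end of file, which ends the loop.
    while (chunk = f.read(chunk_size))
      crc = Zlib.crc32(chunk, crc)
    end
  end
  crc
end
```

Calling crc32_of_file('test.rb') should then give the same integer as the "rb" one-liner above, without loading the whole file into memory.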

    dig_file = Digest::CRC32.file('test.rb')

    test1 = dig_file.hexdigest
    #=> "333134343335383030"

    test2 = dig_file.digest
    #=> "314435800"

    def hexdigest_to_digest(h)
      h.unpack('a2'*(h.size/2)).collect {|i| i.hex.chr }.join
    end

    test3 = hexdigest_to_digest(test1)
    #=> "314435800"

So I am assuming that .to_i.to_s(16) is throwing off your expected result, or perhaps your expected result is flawed? Not sure, but all the best.
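For what it is worth, the same round trip can probably be done with pack alone, because the 'H*' directive decodes pairs of hex characters back into bytes. A quick sketch with the value from above:

```ruby
hex   = '333134343335383030'  # the hexdigest value reported above
bytes = [hex].pack('H*')      # decode each pair of hex characters into one byte

puts bytes                        # 314435800
puts bytes.unpack1('H*') == hex   # true
```

So the hexdigest really is the hex encoding of the decimal string, not of the 32-bit number itself.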


It works just fine; just make sure to use network byte order, like this:

 def finish; [@crc32].pack('N'); end 
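Put together with the class from the question, that could look roughly like the sketch below. It reuses the Digest::CRC32 name from the question and assumes nothing else has already defined that constant; returning self from update and reset is my own addition so that calls can be chained.

```ruby
require 'digest'
require 'zlib'

# Sketch of the class from the question, with finish returning raw bytes.
class Digest::CRC32 < Digest::Class
  def initialize
    super
    reset
  end

  # Start over with an empty checksum.
  def reset
    @crc32 = 0
    self
  end

  # Feed more data into the running checksum.
  def update(str)
    @crc32 = Zlib.crc32(str, @crc32)
    self
  end

  # Four raw bytes in network (big-endian) byte order.
  def finish
    [@crc32].pack('N')
  end
end

data = '123456789'
puts Digest::CRC32.hexdigest(data)           # cbf43926
puts format('%08x', Zlib.crc32(data))        # cbf43926
puts Digest::CRC32.digest(data).unpack1('N') == Zlib.crc32(data)  # true
```

With this version hexdigest gives the zero-padded 8-digit hex form, and the integer can be recovered from digest with unpack1('N') instead of to_i.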

Source: https://habr.com/ru/post/904264/

