Inconsistent zlib compression results between Win32 and 64-bit Linux

I'm using zlib in a program and noticed a one-byte difference in how "foo" is compressed on Windows (1F8B080000000000000A4BCBCF07002165738C03000000) and on Linux (1F8B08000000000000034BCBCF07002165738C03000000). Both decompress back to "foo".

I decided to check outside our code to make sure the implementation was correct, and used the test programs in the zlib repository to double-check. I got the same results:

Linux: echo -n foo | ./minigzip64 > text.txt

Windows: echo|set /p="foo" | minigzip > text.txt

What explains this difference? Is it a problem?

1F8B 0800 0000 0000 00 *03/0A* 4BCB CF07 0021 6573 8C03 0000 00

2 answers

First of all, if the output decompresses to what was compressed, then this is not a problem. Different compressors, the same compressor with different settings, or even the same compressor with the same settings but a different version can all produce different compressed output from the same input.
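As a quick illustration of that point, here is a minimal Python sketch. It assumes Python's standard zlib module as a stand-in for the C library, and the sample input and the choice of levels 0 and 9 are made up for the demo. The two outputs differ, yet both decompress to the same bytes.

```python
import zlib

# Hypothetical sample input, repetitive so that the level matters.
data = b"foo" * 100

stored = zlib.compress(data, 0)  # level 0: stored blocks, no compression
packed = zlib.compress(data, 9)  # level 9: best compression

print(len(stored), len(packed))          # sizes differ
print(stored != packed)                  # so do the bytes
print(zlib.decompress(stored) == data)   # both round-trip to the input
print(zlib.decompress(packed) == data)
```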

Secondly, the compressed data in this case is identical. Only the last byte of the gzip header that precedes the compressed data differs. That byte identifies the source operating system, so it is expected to differ between Linux and Windows.
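This is easy to confirm from the two hex dumps in the question. A small sketch in Python (the variable names are mine) that lists the offsets at which the streams disagree:

```python
# The two outputs quoted in the question.
windows = bytes.fromhex("1F8B080000000000000A4BCBCF07002165738C03000000")
linux = bytes.fromhex("1F8B08000000000000034BCBCF07002165738C03000000")

# Offsets at which the two streams disagree.
diff = [i for i, (w, l) in enumerate(zip(windows, linux)) if w != l]

print(len(windows), len(linux))        # 23 23
print(diff)                            # [9]
print(hex(windows[9]), hex(linux[9]))  # 0xa 0x3
```

Offset 9 is the last byte of the 10-byte gzip header, the OS field.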

Even on the same operating system the header may change, because it contains the modification date and time. In both cases here, however, the modification time was omitted (set to zero).
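To see the timestamp effect in isolation, here is a rough sketch using Python's gzip module rather than minigzip. It assumes Python 3.8 or later (for the mtime keyword of gzip.compress), and the two timestamps are arbitrary values picked for the demo.

```python
import gzip
import struct

data = b"foo"

# Two arbitrary, hypothetical timestamps (seconds since the Unix epoch).
a = gzip.compress(data, mtime=1_000_000_000)
b = gzip.compress(data, mtime=1_500_000_000)

# Bytes 4..7 of the gzip header hold MTIME as a little-endian 32-bit integer.
mtime_a = struct.unpack("<I", a[4:8])[0]
mtime_b = struct.unpack("<I", b[4:8])[0]
print(mtime_a, mtime_b)

print(a[:4] == b[:4])  # signature, method and flags match
print(a[8:] == b[8:])  # everything after the timestamp matches too
```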


Just adding to the accepted answer here. I became curious and tried it myself, saving the raw data and opening it with 7zip:

Windows:

gzip-win

Linux:

gzip-linux

You can immediately see that the only field that differs is Host OS.

What do the data mean?

 Header               | Data       | Footer
 1F8B080000000000000A | 4BCBCF0700 | 2165738C03000000

Let me break it down.

Header

First, from this answer, I understand that this is actually a gzip header rather than a zlib header:

 Level | ZLIB  | GZIP
 1     | 78 01 | 1F 8B
 9     | 78 DA | 1F 8B
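The same table can be reproduced with a few lines of Python. This is only a sketch: first_two_bytes is a helper name I made up, and it relies on the zlib convention that wbits=15 selects the zlib wrapper and wbits=31 (16+15) the gzip wrapper.

```python
import zlib

ZLIB_WRAPPER = 15  # wbits=15: zlib header and trailer
GZIP_WRAPPER = 31  # wbits=16+15: gzip header and trailer

def first_two_bytes(level, wbits):
    """Compress b"foo" and return the first two output bytes as hex."""
    c = zlib.compressobj(level, zlib.DEFLATED, wbits)
    out = c.compress(b"foo") + c.flush()
    return out[:2].hex().upper()

for level in (1, 9):
    print(level,
          first_two_bytes(level, ZLIB_WRAPPER),
          first_two_bytes(level, GZIP_WRAPPER))
# 1 7801 1F8B
# 9 78DA 1F8B
```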

Further searching led me to an article on gzip in the forensics wiki. The values in this case:

 Offset | Size | Value | Description
 0      | 2    | 1f8b  | Signature (or identification bytes 1 and 2)
 2      | 1    | 08    | Compression method (deflate)
 3      | 1    |       | Flags
 4      | 4    |       | Last modification time
 8      | 1    |       | Compression flags (or extra flags)
 9      | 1    | 0A    | Operating system (TOPS-20)
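For anyone who prefers code to a table, the fixed 10-byte header can be unpacked like this. It is a minimal sketch: parse_gzip_header and OS_NAMES are my own names, and the OS mapping is only a subset of the codes listed in RFC 1952.

```python
import struct

# Subset of the OS codes listed in RFC 1952.
OS_NAMES = {
    0: "FAT filesystem",
    3: "Unix",
    10: "TOPS-20",
    11: "NTFS filesystem",
    255: "unknown",
}

def parse_gzip_header(raw):
    """Decode the fixed 10-byte gzip header into a dict."""
    # "<" = little-endian, no padding: 2s B B I B B = 2+1+1+4+1+1 bytes.
    magic, method, flags, mtime, xfl, os_code = struct.unpack("<2sBBIBB", raw[:10])
    if magic != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    return {
        "method": method,  # 8 = deflate
        "flags": flags,
        "mtime": mtime,    # 0 = no timestamp stored
        "xfl": xfl,
        "os": OS_NAMES.get(os_code, str(os_code)),
    }

print(parse_gzip_header(bytes.fromhex("1F8B080000000000000A")))  # Windows
print(parse_gzip_header(bytes.fromhex("1F8B0800000000000003")))  # Linux
```

The two results differ only in the "os" entry: TOPS-20 for the Windows build and Unix for the Linux build.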

Footer

 Offset | Size | Value    | Description
 0      | 4    | 2165738C | Checksum (CRC-32), little-endian
 4      | 4    | 03       | Uncompressed data size, in bytes
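The footer can be checked against the data section directly. A short sketch, assuming the Data and Footer hex strings from the breakdown above; wbits=-15 asks Python's zlib for raw deflate with no header or trailer.

```python
import struct
import zlib

data = bytes.fromhex("4BCBCF0700")          # raw deflate stream
footer = bytes.fromhex("2165738C03000000")  # CRC-32 followed by size

# Both footer fields are little-endian 32-bit unsigned integers.
crc, size = struct.unpack("<II", footer)

# wbits=-15: raw deflate, no zlib or gzip wrapper expected.
plain = zlib.decompress(data, wbits=-15)

print(plain)                      # b'foo'
print(hex(crc), size)             # 0x8c736521 3
print(zlib.crc32(plain) == crc)   # True
print(len(plain) == size)         # True
```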

It is interesting to note that even if the last modification time and the operating system in the header differ, the input still compresses to the same data, with the same checksum in the footer.

The IETF RFC for gzip (RFC 1952) contains a more detailed summary of the format.


Source: https://habr.com/ru/post/1269660/

