What is the best compression scheme for small data, around 1.66 kB?

The data is stored in a C++ array and consists of repeated 125-bit records, each differing slightly from the next. At the end there are 8 messages of 12 ASCII characters each. Would differential compression across the array make sense, and if so, how would I apply it?

Or should I apply some other compression scheme to the whole array?

2 answers

Typically, you can only compress data that has some predictability or redundancy. Dictionary-based compression (ZIP-style algorithms, for example) traditionally performs poorly on small pieces of data, because the dictionary the compressor builds has to be shipped along with the data, and for inputs this small that overhead dominates.

In the past, when I compressed very small pieces of data with somewhat predictable patterns, I used SharpZipLib with a custom dictionary. Instead of embedding the dictionary in the data itself, I hard-coded it into every program that needed to (de)compress the data. SharpZipLib supports both: supplying a custom dictionary, and keeping that dictionary separate from the compressed stream.

Again, this only works if you can predict some of your data's patterns in advance, so that you can build an appropriate compression dictionary and keep that dictionary separate from the compressed data.


You have not given us enough information to help you much. However, I can highly recommend the book Text Compression by Bell, Cleary and Witten. Do not be fooled by the title; "text" here simply means "lossless" - all the methods are applicable to binary data. Since the book is expensive, you might try getting it through interlibrary loan.

Also, do not overlook the obvious methods: Burrows-Wheeler (bzip2) or Lempel-Ziv (gzip, zlib). It is quite possible that one of these will work well for your application, so before exploring alternatives, try compressing your data with the standard tools.


Source: https://habr.com/ru/post/1310680/

