C ++ and RLE for character sequences

I'm having difficulty using RLE for character sequences.

For example, I can do RLE encoding in strings like

"ASSSAAAEERRRRRRRR" 

which will be converted to:

 "A1S3A3E2R8". 

But I would like to do RLE for strings like

 "XXXYYYYY(1ADEFC)(EDCADD)(1ADEFC)(1ADEFC)(1ADEFC)" 

which will be converted to:

 "X3Y5(1ADEFC)1(EDCADD)1(1ADEFC)3" 

Is there any way to achieve this? This task becomes a little easier because long strings always appear in brackets. Could you give advice to do this in C ++?
If there is a better way to preserve values ​​than using parentheses, it would also be great if you recommend me.

0
source share
2 answers

You must break this problem into smaller parts. First, you should have a function that tokens your stream and returns every single part. In this example, the input stream:

 "XXXYYYYY(1ADEFC)(EDCADD)(1ADEFC)(1ADEFC)(1ADEFC)" 

this function will return the following elements: one for each call:

 X X X Y Y Y Y Y (1ADEFC) (EDCADD) (1ADEFC) (1ADEFC) (1ADEFC) <eof> 

If you performed this function correctly, then the RLE algorithm that you have already implemented for single characters should be easily adapted to support longer strings.

+4
source

Since you mention your intention to use RLE to encode data in order to subsequently use gzip compression and achieve better compression, my answer is not to first encode it. gzip compression uses DEFLATE, which is a generalization of string length encoding, which can take advantage of character string strings. You will not get better compression for applying the same algorithm twice, and in fact you may lose a little compression.

If you insist on doing your own RLE, it is better to store the given length instead of using parentheses. That is, instead of (1ADEFC)3 use 61ADEFC3 . Also note that you intend to compress pixels that use the entire range of byte values. Keep this in mind, as an algorithm written to work with strings will not be suitable for raw data with embedded zeros and non-printable characters.

0
source

Source: https://habr.com/ru/post/1443494/


All Articles