I wrote a program to do run-length encoding. Typically, if the text is
AAAAAABBCDEEEEGGHJ
run-length encoding turns it into
A6B2C1D1E4G2H1J1
but this adds an extra 1 for each non-repeating character. Since I am compressing BMP files with it, I came up with the idea of placing a “$” token in front of each repeated character to mark a run (assuming that image files contain a large number of repeated bytes).
So it will look like
$A6$B2CD$E4$G2HJ
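For reference, the two encodings above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the actual program: the function names `rle_classic` and `rle_marked` and the `marker` parameter are made up here, and counts are simply written in decimal (the example only has runs shorter than 10).

```python
from itertools import groupby


def rle_classic(text):
    # Plain run-length encoding: every run becomes <char><count>,
    # so a single character still costs an extra "1".
    return "".join(f"{ch}{len(list(run))}" for ch, run in groupby(text))


def rle_marked(text, marker="$"):
    # Marker variant: runs longer than 1 become <marker><char><count>,
    # single characters are copied through unchanged.
    out = []
    for ch, run in groupby(text):
        n = len(list(run))
        out.append(f"{marker}{ch}{n}" if n > 1 else ch)
    return "".join(out)


print(rle_classic("AAAAAABBCDEEEEGGHJ"))  # A6B2C1D1E4G2H1J1
print(rle_marked("AAAAAABBCDEEEEGGHJ"))   # $A6$B2CD$E4$G2HJ
```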
In this example the length is the same, but there is a noticeable difference for BMP files. Now my problem is decoding. It so happens that some BMP files contain a $<char><num> pattern, e.g. $I9, in the source file, so the compressed file contains the same text, $I9. When decoding, however, this is treated as a run of I repeated 9 times, which produces the wrong output. I want to know which character I can use to mark the beginning of a run so that it does not conflict with the original source.
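The collision can be reproduced with a small sketch like the one below. It is a hypothetical reconstruction, not the real program: `encode` and `decode` are illustrative names, and it assumes single-digit counts (longer runs are split into chunks of at most 9) so that the decoder can read exactly one digit after the character.

```python
from itertools import groupby

DIGITS = "0123456789"


def encode(text, marker="$"):
    # Runs of 2..9 become <marker><char><count>; longer runs are split
    # into chunks of 9; single characters are copied unchanged.
    out = []
    for ch, run in groupby(text):
        n = len(list(run))
        while n > 9:
            out.append(f"{marker}{ch}9")
            n -= 9
        out.append(f"{marker}{ch}{n}" if n > 1 else ch)
    return "".join(out)


def decode(data, marker="$"):
    # Anything shaped like <marker><char><digit> is expanded as a run,
    # whether the encoder produced it or it was already in the source.
    out = []
    i = 0
    while i < len(data):
        if data[i] == marker and i + 2 < len(data) and data[i + 2] in DIGITS:
            out.append(data[i + 1] * int(data[i + 2]))
            i += 3
        else:
            out.append(data[i])
            i += 1
    return "".join(out)


ok = "AAAAAABBCDEEEEGGHJ"
print(decode(encode(ok)) == ok)    # True: no marker in the source

bad = "X$I9Y"                      # source already contains $I9
print(encode(bad))                 # X$I9Y (copied through unchanged)
print(decode(encode(bad)))         # XIIIIIIIIIY (wrong: $I9 was expanded)
```

Passing a different `marker` in this sketch only moves the problem to sources that happen to contain that character followed by a character and a digit.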
Anirudh goel