I am trying to encode signed values in the range -256 to 255 (i.e. 9-bit data stored in a short) with an arithmetic coder. However, the existing arithmetic coding implementations I have found (such as dlib and rANS) usually read the file as a byte stream and treat the data as 8-bit symbols.
The problem with this approach is that splitting the signed data (first histogram below) into a byte stream destroys the original histogram (the result is the second histogram below). I believe this splitting can also hurt the compression ratio (but I could be wrong).
I tested my hypothesis by implementing Huffman coding on the data as 8-bit symbols and as 16-bit symbols, and found that I was right, possibly because Huffman coding builds its tree from the symbol probabilities.
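To illustrate the effect, here is a minimal sketch (not my actual test) that compares the zero-order empirical entropy of the 9-bit values treated as whole symbols with the entropy of the same data after each short is split into two bytes. The test data is an assumption: synthetic values peaked around zero and clamped to [-256, 255]. The helper name `entropy_bits` is made up for this example.

```python
import math
import random
import struct
from collections import Counter


def entropy_bits(symbols):
    """Zero-order empirical entropy, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


random.seed(0)

# Hypothetical input: values concentrated around zero, clamped to 9-bit signed range.
values = [max(-256, min(255, round(random.gauss(0, 40)))) for _ in range(100_000)]

# What a byte-oriented coder sees: every short split into two little-endian bytes.
raw = struct.pack(f"<{len(values)}h", *values)
byte_symbols = list(raw)

# Ideal order-0 coded size in bits for each view of the same data.
bits_whole = entropy_bits(values) * len(values)
bits_split = entropy_bits(byte_symbols) * len(byte_symbols)

print(f"whole 9-bit symbols: {bits_whole / len(values):.2f} bits per value")
print(f"split into bytes:    {bits_split / len(values):.2f} bits per value")
```

With a single order-0 model, the byte-split stream can never cost fewer bits than the whole-symbol stream, because low and high bytes are mixed into one histogram. How large the gap is depends on the data.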
(EDITED) My question is: how can I encode or emulate symbols that do not fit in a regular 8-bit container, so that the result can be compressed with traditional arithmetic coder implementations without hurting the compression ratio?
Histogram of the signed data:

Histogram after splitting into bytes:
