Maximum density of numerical information with printf

Question


My use case writes numbers to a JSON document in which minimizing size is more important than the precision of very small or very large numbers. The numbers are usually in common units, such as milliseconds or meters, and typically fall in the range [0.001, 1000].

Essentially, I would like to set the maximum character length. For example, if the limit was five characters, then:

  from      to
  1234567   123e4
  12345.6   12346
  1234.56   1235
  123.456   123.5
  12.3456   12.35
  1.23456   1.235
  1.23450   1.235
  1.23400   1.234
  1.23000   1.23
  1.20000   1.2
  1.00000   1
  0.11111   0.111
  0.01111   0.011
  0.00111   0.001
  0.00011   11e-4
  0.00001   1e-5
  0.11111   0.111
  0.01111   0.011
  0.00111   0.001
  0.00011   11e-4
  0.00001   1e-5

This test case seems to convey a lot of information within a length limit.

It fails for numbers whose exponents fall outside the range [-99, 999], and this range will vary with the limit imposed. Perhaps the fallback in these rare cases is simply to write a longer string.

This is the ideal, although I could live without rolling my own if another solution comes reasonably close, perhaps one that truncates instead of rounding and does not use scientific / exponent notation.

EDIT: here is what printf produces with %.3f, %.3g and %.4g, compared with the expected output ( code here ):

  printf("%.3f");
  match 0 - 1.23457e+06 -> 1234567.000 expected 12e5
  match 0 - 12345.6 -> 12345.600 expected 12346
  match 0 - 1234.56 -> 1234.560 expected 1235
  match 0 - 123.456 -> 123.456 expected 123.5
  match 0 - 12.3456 -> 12.346 expected 12.35
  match 1 - 1.23456 -> 1.235
  match 0 - 1.2345 -> 1.234 expected 1.235
  match 1 - 1.234 -> 1.234
  match 0 - 1.23 -> 1.230 expected 1.23
  match 0 - 1.2 -> 1.200 expected 1.2
  match 0 - 1 -> 1.000 expected 1
  match 1 - 0.11111 -> 0.111
  match 1 - 0.01111 -> 0.011
  match 1 - 0.00111 -> 0.001
  match 0 - 0.00011 -> 0.000 expected 11e-4
  match 0 - 1e-05 -> 0.000 expected 1e-5
  match 1 - 0.11111 -> 0.111
  match 1 - 0.01111 -> 0.011
  match 1 - 0.00111 -> 0.001
  match 0 - 0.00011 -> 0.000 expected 11e-4
  match 0 - 1e-05 -> 0.000 expected 1e-5

  printf("%.3g");
  match 0 - 1.23457e+06 -> 1.23e+06 expected 12e5
  match 0 - 12345.6 -> 1.23e+04 expected 12346
  match 0 - 1234.56 -> 1.23e+03 expected 1235
  match 0 - 123.456 -> 123 expected 123.5
  match 0 - 12.3456 -> 12.3 expected 12.35
  match 0 - 1.23456 -> 1.23 expected 1.235
  match 0 - 1.2345 -> 1.23 expected 1.235
  match 0 - 1.234 -> 1.23 expected 1.234
  match 1 - 1.23 -> 1.23
  match 1 - 1.2 -> 1.2
  match 1 - 1 -> 1
  match 1 - 0.11111 -> 0.111
  match 0 - 0.01111 -> 0.0111 expected 0.011
  match 0 - 0.00111 -> 0.00111 expected 0.001
  match 0 - 0.00011 -> 0.00011 expected 11e-4
  match 0 - 1e-05 -> 1e-05 expected 1e-5
  match 1 - 0.11111 -> 0.111
  match 0 - 0.01111 -> 0.0111 expected 0.011
  match 0 - 0.00111 -> 0.00111 expected 0.001
  match 0 - 0.00011 -> 0.00011 expected 11e-4
  match 0 - 1e-05 -> 1e-05 expected 1e-5

  printf("%.4g");
  match 0 -> 1.23457e+06 -> 1.235e+06 expected 12e5
  match 0 -> 12345.6 -> 1.235e+04 expected 12346
  match 1 -> 1234.56 -> 1235
  match 1 -> 123.456 -> 123.5
  match 1 -> 12.3456 -> 12.35
  match 1 -> 1.23456 -> 1.235
  match 0 -> 1.2345 -> 1.234 expected 1.235
  match 1 -> 1.234 -> 1.234
  match 1 -> 1.23 -> 1.23
  match 1 -> 1.2 -> 1.2
  match 1 -> 1 -> 1
  match 0 -> 0.11111 -> 0.1111 expected 0.111
  match 0 -> 0.01111 -> 0.01111 expected 0.011
  match 0 -> 0.00111 -> 0.00111 expected 0.001
  match 0 -> 0.00011 -> 0.00011 expected 11e-4
  match 0 -> 1e-05 -> 1e-05 expected 1e-5
  match 0 -> 0.11111 -> 0.1111 expected 0.111
  match 0 -> 0.01111 -> 0.01111 expected 0.011
  match 0 -> 0.00111 -> 0.00111 expected 0.001
  match 0 -> 0.00011 -> 0.00011 expected 11e-4
  match 0 -> 1e-05 -> 1e-05 expected 1e-5


json c number-formatting

Drew Noakes Nov 02 '14 at 13:16


2 answers

Brendan · Answer 1 · 2014-11-02T20:55:09+0000

To pack a number from a known range into the smallest possible unsigned integer:

1) Subtract the smallest possible value. For example, if your numbers can vary from 0.001 to 100000, and a specific number is 123.456, then subtract 0.001 to get 123.455

2) Divide by the precision you care about. For example, if you care about thousandths, divide by 0.001. In this case, the number 123.455 becomes 123455.

Once you have done this and have the smallest unsigned integer, convert it to hexadecimal digits (or possibly “base 32 digits”). In the example above, 0.001 becomes 0x00000000, 123.456 becomes 0x0001E23F, and 100000 becomes 0x05F5E0FF.
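The two steps above, plus the hexadecimal conversion, might be sketched in C as follows. The helper names quantize and to_hex are hypothetical and not part of the answer, and the sketch rounds to the nearest step (an assumption) so that floating-point error in the division cannot leave the result one unit short. It also prints the hex digits without leading zeros, since the goal is a short string.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: steps 1 and 2. Subtract the smallest possible value,
   then divide by the precision, giving an unsigned count of `step` units. */
static uint32_t quantize(double x, double min, double step)
{
    assert(x >= min && step > 0.0);
    /* Adding 0.5 before the cast rounds to the nearest integer. */
    return (uint32_t)((x - min) / step + 0.5);
}

/* Hypothetical helper: write q as uppercase hex digits, no leading zeros.
   Returns the number of characters written (as snprintf does). */
static int to_hex(uint32_t q, char *buf, size_t size)
{
    return snprintf(buf, size, "%" PRIX32, q);
}
```

With min = 0.001 and step = 0.001, quantize(123.456, 0.001, 0.001) gives 123455, which to_hex writes as 1E23F.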

If you want "variable precision", you can add a third step that splits the unsigned integer into a "value and shift" form. For instance:

  shift_count = 0;
  while (value > 0xFFF) {
      value = value >> 1;
      shift_count++;
  }

Then you can combine the two with something like value = (value << 4) | shift_count.

This way you can compress your numbers to 4 hexadecimal digits. For the examples above, 0.001 becomes 0x0000 (exactly 0.001), 123.456 becomes 0xF115 (which actually represents 123.425), and 100000 becomes 0xBEBF (which actually represents 99975.169).
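As a rough illustration of the "value and shift" form, here is a minimal C sketch with a matching decoder. The function names are hypothetical, and the input is assumed to be an integer that has already been offset and scaled as in steps 1 and 2. With a 4-bit shift count, the scheme only works for inputs below 2^27; the sketch asserts this rather than handling larger values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack q into 16 bits, with a 12-bit value in the high
   bits and a 4-bit shift count in the low bits. Bits shifted out are lost,
   so the packed number is truncated, not rounded. */
static uint16_t pack_value_shift(uint32_t q)
{
    unsigned shift_count = 0;
    while (q > 0xFFF) {
        q >>= 1;
        shift_count++;
    }
    assert(shift_count <= 0xF);   /* holds whenever the input is below 2^27 */
    return (uint16_t)((q << 4) | shift_count);
}

/* Hypothetical helper: recover the (truncated) integer from the packed form. */
static uint32_t unpack_value_shift(uint16_t packed)
{
    return (uint32_t)(packed >> 4) << (packed & 0xF);
}
```

For the scaled integer 123455 this yields 0xF115, which unpacks to 123424; multiplying by 0.001 and adding back the 0.001 offset gives the 123.425 quoted above.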

nwellnhof · Answer 2 · 2014-11-03T00:18:11+0000

It seems that you need to write your own conversion routine. The ecvt function may help.

But I would just use the format %.3g or %.4g, strip the extra plus sign and the leading zeros of the exponent, and call it a day. This basically leaves a few decimal points that could still be optimized away. Since you are so concerned about the size of your JSON response, you are probably going to use HTTP compression anyway, so I doubt the remaining characters would add much overhead.
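The post-processing described here could look roughly like the following C sketch. The function name format_compact is hypothetical, and the sketch assumes finite inputs, since inf and nan are not valid JSON numbers anyway. It formats with %.*g and then removes the plus sign and leading zeros from the exponent in place.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format x with "%.<precision>g", then strip the '+'
   and the leading zeros of the exponent, e.g. "1.235e+06" -> "1.235e6" and
   "1e-05" -> "1e-5". Returns the length of the resulting string. */
static size_t format_compact(double x, int precision, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "%.*g", precision, x);

    char *e = strchr(buf, 'e');
    if (e == NULL)
        return strlen(buf);            /* fixed notation: nothing to strip */

    char *src = e + 1;                 /* reads the original exponent */
    char *dst = e + 1;                 /* writes the compacted exponent */
    if (*src == '-')
        *dst++ = *src++;               /* keep a minus sign */
    else if (*src == '+')
        src++;                         /* drop a plus sign */
    while (*src == '0' && src[1] != '\0')
        src++;                         /* drop leading zeros, keep one digit */
    memmove(dst, src, strlen(src) + 1);
    return strlen(buf);
}
```

With a precision of 4, 1234567 comes out as 1.235e6 and 0.00001 as 1e-5, while values that printf already writes in fixed notation, such as 1234.56 (1235), are left unchanged.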
