When will a Python float lose precision when cast to a Protobuf / C++ float?

I am interested in minimizing the size of a protobuf message serialized from Python.

Protobuf has floats (4 bytes) and doubles (8 bytes). Python has a float type, which is actually a C double, at least in CPython.

My question is: given an instance of a Python float, is there a "fast" way to check whether the value would lose precision if it were assigned to a protobuf float (or indeed a C++ float)?

+4
3 answers

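A C float (binary32) keeps fewer significand bits than a double, so fewer digits are available to it. You can test whether a value fits by looking at its float.hex() representation: a single-precision value uses at most 6 hex digits after the point (7 counting the leading digit), and the 6th of those must be even (its lowest bit clear), because going from 64-bit to 32-bit leaves only 23 significand bits. The exponent must lie between -126 and 127: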

import math
import re

def is_single_precision(
        f,
        _isfinite=math.isfinite,
        _singlepat=re.compile(
            r'-?0x[01]\.[0-9a-f]{5}[02468ace]0{7}p'
            r'(?:\+(?:1[01]\d|12[0-7]|[1-9]\d|\d)|'
            r'-(?:1[01]\d|12[0-6]|[1-9]\d|\d))$').match):
    return not _isfinite(f) or _singlepat(f.hex()) is not None or f == 0.0

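The float.hex() method is quite fast, faster than roundtripping via struct or numpy; you can create 1 million hex representations in under half a second: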

>>> import timeit
>>> timeit.Timer('(1.2345678901e+26).hex()').autorange()
(1000000, 0.47934128501219675)

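The regex engine is also fast; with the name lookups bound as default arguments in the function above, 1 million float values can be tested in about 1.1 seconds: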

>>> import random, sys
>>> testvalues = [0.0, float('inf'), float('-inf'), float('nan')] + [random.uniform(sys.float_info.min, sys.float_info.max) for _ in range(2 * 10 ** 6)]
>>> timeit.Timer('is_single_precision(f())', 'from __main__ import is_single_precision, testvalues; f = iter(testvalues).__next__').autorange()
(1000000, 1.1044921400025487)

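This works because the binary32 float format allots 23 bits to the significand and 8 bits to the exponent. The regex only allows the first 23 significand bits to be set, and requires the exponent to fall within the range of normal binary32 exponents, -126 to 127.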
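To see where the regex's even sixth digit and seven trailing zero digits come from, here is a minimal sketch that looks at the raw bits instead of the hex string. A binary64 significand field has 52 bits, so a value fits in 23 bits only if the low 29 bits are zero (one bit from the sixth hex digit plus 7 × 4 bits of trailing zeros). The helper name low_significand_bits is hypothetical, chosen for illustration only.

```python
import struct

def low_significand_bits(x: float) -> int:
    """Return the 29 low-order bits of the 52-bit binary64 significand field."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits & ((1 << 29) - 1)

# 1.5 needs a single significand bit, so nothing spills past the first 23 bits.
print((1.5).hex(), low_significand_bits(1.5))   # 0x1.8000000000000p+0 0

# 0.1 uses all 52 bits, so the low 29 bits are non-zero.
print((0.1).hex(), low_significand_bits(0.1) != 0)   # 0x1.999999999999ap-4 True
```

This only illustrates the significand part of the check; the exponent range still has to be tested separately.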


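However, this may not be what you want! Take, for example, 1/3 or 1/10. Both can only be approximated in binary floating point, and both fail the test: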

>>> (1/3).hex()
'0x1.5555555555555p-2'
>>> (1/10).hex()
'0x1.999999999999ap-4'

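You may need a heuristic approach instead; for example, if the exponent falls outside the (-126, 127) range, converting to a float loses too much and the value should stay a double.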
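One possible heuristic, if an exact round trip is too strict for values such as 1/3 and 1/10, is to accept a value when the relative error after a binary32 round trip stays below a tolerance. The sketch below is only an illustration under that assumption: the helper name close_enough_as_float32 and the default rel_tol of 1e-6 are hypothetical choices, not part of the answer above.

```python
import math
import struct

_FLOAT32 = struct.Struct("f")

def close_enough_as_float32(x: float, rel_tol: float = 1e-6) -> bool:
    """Hypothetical helper: True if x survives a binary32 round trip within rel_tol."""
    if math.isnan(x) or math.isinf(x):
        return True  # NaN and the infinities exist in both formats
    try:
        (y,) = _FLOAT32.unpack(_FLOAT32.pack(x))
    except OverflowError:
        return False  # finite, but too large for binary32
    return math.isclose(x, y, rel_tol=rel_tol)

print(close_enough_as_float32(1 / 3))    # True: relative error is about 2**-24 at most
print(close_enough_as_float32(1e300))    # False: overflows binary32
print(close_enough_as_float32(1e-50))    # False: underflows to 0.0
```

The right tolerance depends on the data; binary32 carries roughly 7 significant decimal digits, so anything much tighter than about 1e-7 behaves almost like the exact test.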

+4

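For completeness, here is the "round trip through struct" method mentioned in the comments, which has the advantage of not requiring numpy while still giving accurate results: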

import struct, math
def is_single_precision_struct(x, _s=struct.Struct("f")):
    return math.isnan(x) or _s.unpack(_s.pack(x))[0] == x

Timing comparison with is_single_precision_numpy():

  • is_single_precision_numpy (f): [2.5650789737701416, 2.5488431453704834, 2.551704168319702]
  • is_single_precision_struct (f): [0.3972139358520508, 0.39684605598449707, 0.39119601249694824]

So it also seems to be faster on my machine.
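One caveat with is_single_precision_struct as written: in CPython, packing a finite double that is too large for binary32 with the "f" format raises OverflowError rather than producing infinity. A minimal sketch of a variant that treats this case as "does not fit" follows; the name is_single_precision_struct_safe is hypothetical, and the extra try/except may cost some of the speed measured above.

```python
import math
import struct

_FLOAT32 = struct.Struct("f")

def is_single_precision_struct_safe(x):
    """Struct round trip that reports overflow as 'does not fit' instead of raising."""
    if math.isnan(x):
        return True  # NaN never compares equal, so handle it up front
    try:
        return _FLOAT32.unpack(_FLOAT32.pack(x))[0] == x
    except OverflowError:
        return False  # finite value beyond the binary32 range

print(is_single_precision_struct_safe(1.5))     # True
print(is_single_precision_struct_safe(0.1))     # False
print(is_single_precision_struct_safe(1e300))   # False, no exception
```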

+2

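If you are already using NumPy, a simple-minded approach is to convert the value to np.float32 and check whether it is preserved: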

import numpy as np

def is_single_precision_numpy(floatval, _float32=np.float32):
    return _float32(floatval) == floatval

This handles corner cases correctly, including values in the subnormal range of float32. For example:

>>> is_single_precision_numpy(float.fromhex('0x13p-149'))
True
>>> is_single_precision_numpy(float.fromhex('0x13.8p-149'))
False

These subnormal cases are harder to handle with the hex-based approach.
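The two values above show why: as a double, a float32 subnormal is an ordinary normal number whose exponent lies below -126, so a check that only accepts exponents from -126 to 127 rejects it even though it is exactly representable. The sketch below uses a struct round trip as a stdlib stand-in for the np.float32 comparison; the helper name roundtrips_as_float32 is hypothetical.

```python
import struct

def roundtrips_as_float32(x: float) -> bool:
    """Stdlib stand-in for the np.float32 comparison (x must be finite and in range)."""
    return struct.unpack("f", struct.pack("f", x))[0] == x

exact = float.fromhex("0x13p-149")      # 19 * 2**-149, a float32 subnormal
inexact = float.fromhex("0x13.8p-149")  # 19.5 * 2**-149, needs one more bit

# Seen as a double, the value is normalised and its exponent is -145.
print(exact.hex())                      # 0x1.3000000000000p-145

print(roundtrips_as_float32(exact))     # True
print(roundtrips_as_float32(inexact))   # False
```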

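It is, however, slower than @Martijn Pieters' solution by roughly a factor of two (which surprised me; I expected the regex to be the slower one). Timings follow (is_single_precision_re_hex is Martijn's version).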

>>> timeit.Timer('is_single_precision_numpy(f)', 'f = 1.2345678901e+26; from __main__ import is_single_precision_numpy').repeat(3, 10**6)
[2.035495020012604, 2.0115931580075994, 2.013475093001034]
>>> timeit.Timer('is_single_precision_re_hex(f)', 'f = 1.2345678901e+26; from __main__ import is_single_precision_re_hex').repeat(3, 10**6)
[1.1169273109990172, 1.1178153319924604, 1.1184561859990936]

Unfortunately, while almost all corner cases (subnormal values, infinities, signed zeros, overflows, etc.) are handled correctly, there is one corner case where this solution does not work: when floatval is a NaN. In that case, is_single_precision_numpy returns False. This may or may not matter for your needs. If it does, adding an extra isnan check should do the trick:

import math
import numpy as np

def is_single_precision_numpy(floatval, _float32=np.float32, _isnan=math.isnan):
    return _float32(floatval) == floatval or _isnan(floatval)
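The reason the extra check is needed is that NaN is unordered under IEEE 754 comparison rules: it compares unequal to everything, including itself, so no equality-based round-trip test can ever accept it. A short illustration using only the standard library:

```python
import math

nan = float("nan")

# Every equality comparison involving NaN is False,
# so a plain round-trip comparison can never succeed for NaN.
print(nan == nan)        # False
print(math.isnan(nan))   # True

# Signed zeros, by contrast, compare equal, so they pass the comparison.
print(0.0 == -0.0)       # True
```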
+1

Source: https://habr.com/ru/post/1692883/

